Archived Forum Post

Question:

CkZip put_OemCodePage with utf-16 truncates files in zip file

Sep 01 '15 at 01:47

Hi,
After upgrading to "Chilkat C/C++ Libs for VC++ 2015 / win32", we have started to see some odd behaviour when zipping files (zip.AppendFiles...).
Prior to zipping we set the code page to utf-16 (1200). What we see is that the names of all the files in the zip are truncated, e.g. "MyPicture.jpg" becomes "MyPicture.jp". This is 100% consistent for all the files.

We don't have this problem if we switch to utf-8.
Running the same code from Visual Studio 2013 (on the same computer) with the old Chilkat (prior to 2015) does not yield this problem.

Any idea what could cause this?

Code

...
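zip.put_OemCodePage(1200);  // 1200 = UTF-16; the setting that triggers the truncation
if (!zip.AppendFiles(pszFilesToAdd, true))  // true = recurse into subdirectories
{
    // error handling (we don't get here in our example)
}
// Show the log of the last call, even though it succeeded
CkString strError;
zip.LastErrorText(strError);
AfxMessageBox(strError);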

This yields the following "LastErrorText" (i.e. no error as far as I can see):

ChilkatLog:
  AppendFiles:
    DllDate: Jun 23 2015
    ChilkatVersion: 9.5.0.51
    UnlockPrefix: ABCD
    Username: XXPC:YY
    Architecture: Little Endian; 32-bit
    Language: Visual C++ 12.0 (32-bit)
    VerboseLogging: 0
    appendFileEx:
      FilePattern: C:TempJustATest.psn
      AppendFromDir: .
      PathPrefix:
      BaseDir: C:TempJustATest.psn
      InzipBase:
      FilenamePart: *
      IsSpecificFile: 0
      recurse: 1
      saveExtraPath: 0
      archiveOnly: 0
      includeHidden: 1
      includeSystem: 1
      ignoreAccessDenied: 1
      No exclusion patterns.
      numAdded: 174
    --appendFileEx
    Success.
  --AppendFiles
--ChilkatLog

Thanks in advance.
Hakon

Answer

I'll have a look tomorrow...

Answer

I reproduced the problem, and I'll work to fix it. However, I suspect it's not a good idea to use utf-16 in a .zip anyway. I used 7-Zip to open the .zip that was written, and the result is as I expected: each entry shows only its 1st char. This is because a typical zip implementation expects a multibyte charset encoding (i.e. nothing with embedded nulls), and the embedded nulls within utf-16 are unintentionally seen as string terminators. I suspect a lot of zip implementations will have trouble with it.
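
To illustrate the embedded-null effect described above, here is a minimal sketch in plain standard C++ (no Chilkat calls). It assumes an ASCII-only file name and little-endian UTF-16, which is what code page 1200 denotes on Windows. The helper names asciiToUtf16le and lengthSeenByCStringReader are hypothetical, made up for this illustration, and are not meant to describe how Chilkat or 7-Zip work internally.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: encode an ASCII-only name as UTF-16LE bytes.
// Each ASCII character becomes two bytes: its code, then a zero byte.
std::string asciiToUtf16le(const std::string& ascii) {
    std::string bytes;
    for (char c : ascii) {
        bytes.push_back(c);     // low byte: the ASCII code
        bytes.push_back('\0');  // high byte: always zero for ASCII
    }
    return bytes;
}

// Hypothetical helper: the length a reader would see if it treated the
// stored name as a null-terminated byte string (as strlen does).
std::size_t lengthSeenByCStringReader(const std::string& storedBytes) {
    return std::strlen(storedBytes.c_str());
}
```

For "MyPicture.jpg", asciiToUtf16le produces 26 bytes, but lengthSeenByCStringReader reports 1 for them, because the scan stops at the zero byte that follows "M". That is consistent with 7-Zip showing only the first character of each entry. A utf-8 name contains no zero bytes, so the same reader sees all 13 characters.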