Archived Forum Post

Question:

CkZip put_OemCodePage with utf-16 truncates file names in zip file

Sep 01 '15 at 01:47

Hi,
After upgrading to "Chilkat C/C++ Libs for VC++ 2015 / win32", we have started to see some odd behaviour when zipping files (zip.AppendFiles...).
Before zipping, we set the code page to utf-16 (1200). The result is that the names of all the files in the zip are truncated, e.g. "MyPicture.jpg" becomes "MyPicture.jp". This is 100% consistent for all the files.

We don't have this problem if we switch to utf-8.
Running the same code from Visual Studio 2013 (on the same computer) with the old Chilkat (prior to 2015) does not produce this problem.

Any idea what could cause this?

Code

...
zip.put_OemCodePage(1200);  // UTF-16
if (!zip.AppendFiles(pszFilesToAdd, true))
{
    // error handling (we don't get here in our example)
}
CkString strError;
zip.LastErrorText(strError);
AfxMessageBox(strError);

This yields the following "LastErrorText" (i.e. no error, as far as I can see):
ChilkatLog:
  AppendFiles:
    DllDate: Jun 23 2015
    ChilkatVersion: 9.5.0.51
    UnlockPrefix: ABCD
    Username: XXPC:YY
    Architecture: Little Endian; 32-bit
    Language: Visual C++ 12.0 (32-bit)
    VerboseLogging: 0
    appendFileEx:
      FilePattern: C:TempJustATest.psn
      AppendFromDir: .
      PathPrefix: 
      BaseDir: C:TempJustATest.psn
      InzipBase: 
      FilenamePart: *
      IsSpecificFile: 0
      recurse: 1
      saveExtraPath: 0
      archiveOnly: 0
      includeHidden: 1
      includeSystem: 1
      ignoreAccessDenied: 1
      No exclusion patterns.
      numAdded: 174
    --appendFileEx
    Success.
  --AppendFiles
--ChilkatLog

Thanks in advance.
Hakon


Answer

I'll have a look tomorrow...


Answer

I reproduced the problem and will work on a fix. However, I suspect it's not a good idea to use utf-16 in a .zip anyway. I tested opening the resulting .zip in 7-Zip, and the result is as I expected: each entry shows only the 1st char. This is because a typical zip implementation expects a multibyte charset encoding (i.e. nothing with embedded nulls), so the null bytes embedded in utf-16 are unintentionally treated as string terminators. I suspect a lot of zip implementations will have trouble with it.
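
To make the embedded-null point concrete, here is a minimal standalone sketch. It does not use Chilkat and is not how Chilkat or 7-Zip is implemented; the helper names toUtf16Le and asCString are hypothetical, and the sketch assumes an ASCII-only file name and a reader that treats the stored name as a null-terminated byte string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: encode an ASCII-only name as UTF-16LE bytes.
// Each ASCII character becomes its own byte followed by a zero high byte.
std::vector<char> toUtf16Le(const std::string& ascii) {
    std::vector<char> out;
    for (char c : ascii) {
        assert(static_cast<unsigned char>(c) < 0x80); // sketch handles ASCII only
        out.push_back(c);     // low byte
        out.push_back('\0');  // high byte is zero for ASCII
    }
    out.push_back('\0');      // two-byte UTF-16 terminator
    out.push_back('\0');
    return out;
}

// Hypothetical helper: what a reader sees if it treats the bytes as a
// null-terminated C string. It stops at the first zero byte.
std::string asCString(const std::vector<char>& bytes) {
    return std::string(bytes.data());
}
```

Under those assumptions, asCString(toUtf16Le("MyPicture.jpg")) yields just "M", because the second byte of the encoded name is already zero. That matches the one-character entries seen in 7-Zip.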