Archived Forum Post

Question:

How to deal with Non-US-ASCII Characters in Filenames?

Dec 28 '14 at 20:50

Hi there, I am trying to write a program with the help of the Chilkat Zip C/C++ Library. I have a question about zipping files that have Chinese characters in their filenames.

I saw a solution here that first calls

               zip.put_OemCodePage(65001)

and then calls

               zip.AppendFiles(Directory, recurse)

However, I tried this and it didn't work. Even if it had worked, I would prefer a way to add individual files rather than a whole directory, since most of the time the directory contains other files that should not be zipped. For example, suppose my program creates two files under the directory C:\temp\:

               file1.txt   , 文件1.doc

And only 文件1.doc should be zipped.

I would really appreciate any ideas. Thanks very much!


Accepted Answer

If using AppendFiles to add files to a .zip, then all filenames should be correctly added because internally, Chilkat works in utf-8.

If the OemCodePage property is set to 65001 (which is the code page value for utf-8), then Chilkat Zip will correctly produce a .zip with utf-8 filenames. I have no doubt that this is working correctly.

Problems can arise if:

  1. The filesystem itself does not support Unicode filenames.
  2. A zip utility or other zip software does not support reading .zip archives having utf-8 filenames. (For example, WinZip added support for Unicode (utf-8) filenames starting with version 11.2. Some zip software still doesn't support utf-8 filenames.)

Passing Unicode between your application and Chilkat is possible in two ways: Either through the Unicode C++ API (wchar_t) or through the multibyte API (which you are currently using) with utf-8.

To use wchar_t, use the Chilkat classes ending with "W". See here: http://www.chilkatsoft.com/refdoc/wcppCkZipWRef.html

To pass utf-8 strings to the Chilkat multibyte API (and to tell Chilkat to return utf-8), use the "Utf8" property. See here: http://www.cknotes.com/utf8-c-property-allows-for-utf-8-or-ansi-const-char/
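To make "passing utf-8 strings" concrete for the filename in the question, here is a minimal sketch using only the C++ standard library. The helpers encodeUtf8 and isPlainAscii are hypothetical names made up for illustration and are not part of Chilkat. The sketch assumes valid Unicode code points as input and does no validation.

```cpp
#include <string>

// Hypothetical helper (not a Chilkat function): encode Unicode code points
// as UTF-8 bytes. Assumes valid input; surrogates and out-of-range values
// are not rejected.
std::string encodeUtf8(const std::u32string &text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {                       // 1 byte: plain ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes (most CJK characters)
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: true if every byte is 7-bit ASCII, i.e. the name
// reads the same whether it is interpreted as ANSI or as utf-8.
bool isPlainAscii(const std::string &s) {
    for (unsigned char c : s) {
        if (c >= 0x80) return false;
    }
    return true;
}
```

With this, encodeUtf8(U"\u6587\u4EF61.doc") yields the utf-8 bytes for 文件1.doc, and its c_str() is the kind of const char * the multibyte API expects once the Utf8 property is set to true. A name such as file1.txt is plain ASCII and is unaffected by the setting.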


Answer

P.S. It is also possible to append individual files to a .zip via the AppendOneFileOrDir method.
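Putting the pieces of this thread together, zipping a single file might look roughly like the sketch below. The function name zipSingleFile is made up for illustration. It is written as a template so that it does not depend on the Chilkat headers here; in a real build the Zip type is assumed to be CkZip (from CkZip.h), and both paths are assumed to be utf-8 encoded. Any license/unlock step your Chilkat version requires is omitted.

```cpp
#include <string>

// Sketch only: Zip is assumed to be Chilkat's CkZip. zipSingleFile is a
// hypothetical helper name, not part of the Chilkat API.
template <typename Zip>
bool zipSingleFile(Zip &zip, const char *zipPathUtf8, const char *filePathUtf8) {
    // Tell the multibyte API that const char * arguments are utf-8.
    zip.put_Utf8(true);

    // Store filenames inside the .zip as utf-8 (code page 65001).
    zip.put_OemCodePage(65001);

    // Start a new, empty archive that will be written to zipPathUtf8.
    if (!zip.NewZip(zipPathUtf8)) return false;

    // Add just this one file instead of a whole directory.
    // The second argument (saveExtraPath) is false here; check the Chilkat
    // reference for how it affects the path stored in the archive.
    if (!zip.AppendOneFileOrDir(filePathUtf8, false)) return false;

    // Write the archive to disk and close it.
    return zip.WriteZipAndClose();
}
```

In the scenario from the question, this would be called with a CkZip object and the utf-8 encoded path of 文件1.doc, leaving file1.txt out of the archive. If a call returns false, the object's last-error text should explain why.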