
Whenever I use CkZip and CkFileAccess, I wonder what encoding is expected for the const char * parameters of methods like:

bool OpenForRead(const char *filePath);
bool OpenZip(const char *ZipFileName);

I want to make sure that on Windows that translates into a Unicode path, but I am not sure how Chilkat handles these.

What is the expected encoding of the input, and does Chilkat convert it to Unicode (UTF-16) under the hood on Windows? Does it use the W versions of the APIs (e.g. CreateFileW) when opening files on Windows?

asked Oct 06 at 08:54

Bogdan


Answering my own question: I just remembered that CkZip and CkFileAccess have Unicode versions, CkZipW and CkFileAccessW.

A reminder on the doc page of CkZip and CkFileAccess about their Unicode versions would probably be a good addition to the documentation.

answered Oct 06 at 09:08

Bogdan

edited Oct 06 at 09:09

Bogdan,

For the multibyte Chilkat C/C++ API (i.e. "CkZip" instead of "CkZipW"), strings are passed as "const char *", and the lower case alternative methods (those that return strings) return "const char *".

By default, "const char *" is assumed to point to ANSI bytes (i.e. chars that use the ANSI encoding, which is typically a 1-byte per char encoding).

All Chilkat classes, except for CkString, have a Utf8 property (get_Utf8(), put_Utf8(bool value)). This controls how Chilkat interprets the bytes of a "const char *": as utf-8 or as ANSI. By default, the Utf8 property is false. If Utf8 is true, then "const char *" inputs are interpreted as utf-8, and the "const char *" returned by the lowercase string method alternatives will be in utf-8.

See http://cknotes.com/utf8-c-property-allows-for-utf-8-or-ansi-const-char/
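To see why the flag matters, here is a minimal sketch (plain C++, not Chilkat code) showing that the same character is a different byte sequence in the two encodings. The helper name latin1ToUtf8 is hypothetical, and ISO-8859-1 is only a stand-in for "ANSI", since the real ANSI code page depends on the Windows locale.

```cpp
#include <string>

// Re-encode ISO-8859-1 bytes as UTF-8.
// Assumption: ISO-8859-1 stands in for "ANSI" here; the actual ANSI
// code page on Windows depends on the system locale.
std::string latin1ToUtf8(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            // 7-bit US-ASCII: the byte is identical in both encodings.
            out.push_back(static_cast<char>(c));
        } else {
            // 0x80..0xFF becomes a two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));   // lead byte
            out.push_back(static_cast<char>(0x80 | (c & 0x3F))); // continuation byte
        }
    }
    return out;
}
```

For example, "é" is the single byte 0xE9 in ISO-8859-1 but the two bytes 0xC3 0xA9 in utf-8, so a path containing it is only read correctly when the Utf8 setting matches the bytes actually passed in.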

Also, an application can set the ANSI/utf-8 behavior globally by setting the CkGlobal's DefaultUtf8 property.

Finally: internally, Chilkat will use the Unicode APIs (such as CreateFileW) to open files where the path contains any non-us-ascii chars. If the path is entirely 7-bit us-ascii, Chilkat is free to use either CreateFileA or CreateFileW; it really doesn't matter. This applies not just to CreateFileW, but to any Microsoft Platform SDK function that has ANSI (A) and Unicode (W) alternatives.
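The "entirely 7-bit us-ascii" test described above could look roughly like this. It is only an illustration: the function name isUsAscii is hypothetical and this is not Chilkat's actual code.

```cpp
// Returns true when every byte of the NUL-terminated string is 7-bit US-ASCII.
// Hypothetical helper for illustration; not taken from Chilkat.
bool isUsAscii(const char* s) {
    for (; *s != '\0'; ++s) {
        // Cast first: plain char may be signed, so bytes >= 0x80 can be negative.
        if (static_cast<unsigned char>(*s) > 0x7F) {
            return false;
        }
    }
    return true;
}
```

A path for which this returns true has the same bytes in ANSI and utf-8, which is why the choice between the A and W function cannot change the result for such paths.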

answered Oct 06 at 09:38

chilkat ♦♦

edited Oct 06 at 09:38

Thank you for the quick reply, Matt!

"Finally: Internally, Chilkat will use Unicode (such as CreateFileW) to open files where the path contains any non-us-ascii chars. If the path is entirely 7bit us-ascii, Chilkat is free to use either CreateFileA or CreateFileW, it really doesn't matter. This applies to not just CreateFileW, but any Microsoft Platform SDK method that has ANSI(A) or Unicode(W) alternatives."

Does this mean that on Windows Chilkat decides whether to use CreateFileA or CreateFileW instead of always using CreateFileW?

I ask because all ANSI(A) alternatives of the Microsoft Platform SDK internally convert strings from ANSI to Unicode and then call the Unicode(W) version of the API, so I wonder why it makes sense for Chilkat to ever call the ANSI(A) versions.

For instance, here is how CreateFileA is implemented:

uf /c KERNELBASE!CreateFileA
KERNELBASE!CreateFileA
  KERNELBASE!CreateFileA+0x1f:
    call to KERNELBASE!Basep8BitStringToDynamicUnicodeString
  KERNELBASE!CreateFileA+0x58:
    call to KERNELBASE!CreateFileW
  KERNELBASE!CreateFileA+0x65:
    call to ntdll!RtlFreeAnsiString
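
The shape of that call sequence (convert the 8-bit string, forward to the W function) can be sketched in portable C++. Both function names below are hypothetical mocks, not Windows APIs, and the byte-by-byte widening is only correct for ISO-8859-1; the real conversion uses the active code page.

```cpp
#include <string>

// Hypothetical stand-in for a W-style API: it simply echoes the UTF-16 path
// it received, so the forwarding can be observed.
std::u16string mockOpenW(const std::u16string& path) {
    return path;
}

// Hypothetical stand-in for an A-style API: widen the 8-bit string,
// then forward to the W version.
// Assumption: input is ISO-8859-1, where each byte maps to one UTF-16 unit.
std::u16string mockOpenA(const std::string& path) {
    std::u16string wide;
    for (unsigned char c : path) {
        wide.push_back(static_cast<char16_t>(c));
    }
    return mockOpenW(wide);
}
```

With this shape, calling the A function and converting first before calling the W function end up on the same code path.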

answered Oct 06 at 09:53

Bogdan

edited Oct 06 at 10:04

The older Windows Mobile (CE) builds don't have the Unicode Platform SDK functions available, so, as I said, Chilkat can choose one or the other. The choice depends on what Chilkat already has. If it has utf-8 or utf-16 in hand, then it might choose CreateFileW. If it has ANSI in hand, then it makes no difference whether Chilkat first converts to Unicode and then calls CreateFileW, or just calls CreateFileA and lets Microsoft convert.
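For the "utf-8 in hand" case, a W function needs UTF-16, so a conversion has to happen somewhere. Below is a minimal sketch of such a conversion in portable C++. The name utf8ToUtf16 is hypothetical, this is not Chilkat's implementation, and the validation is deliberately simplified (overlong forms and encoded surrogates are not rejected).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 and re-encode it as UTF-16.
// Throws std::invalid_argument on truncated or malformed input.
// Simplified sketch: overlong encodings and encoded surrogates are accepted.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF) {
            throw std::invalid_argument("code point out of range");
        }
        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 | (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 | (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows the same job is normally done with MultiByteToWideChar and CP_UTF8; the hand-written version is shown only so the example stays self-contained.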

I don't like to discuss internals because I don't have the time to defend the internal implementation. It is a mature implementation that has evolved over the last 16 years...

answered Oct 06 at 10:12

chilkat ♦♦

Thank you for the details, Matt. No need to defend the internal implementation; I trust Chilkat to do the right thing here. I was just curious and wanted to learn something new about the reasoning behind it. I have found on several occasions that understanding how Chilkat works helped me use it as intended, and so make the best use of these excellent components.

answered Oct 06 at 11:40

Bogdan

edited Oct 06 at 11:45


Asked: Oct 06 at 08:54

Seen: 167 times

Last updated: Oct 06 at 11:45