Archived Forum Post


Question:

CkString getEnc("utf-8") strips trailing double-quote

Mar 31 '14 at 03:04

After upgrading from 9.4.1.68 to 9.5.0.16, I am experiencing a strange issue with CkString's getEnc() method. If the CkString contains a UTF-8 encoded string and I use getEnc("utf-8") to get the string, any trailing double-quote character is stripped. I know that it should not be necessary to use getEnc() in this case, but I have my reasons and this is simply an example. If I use getUtf8(), which should be equivalent to getEnc("utf-8"), the returned string keeps the trailing double-quote character.

Here is an example:

CkString cks;
cks.appendUtf8("\"123\"");
const char *str = cks.getEnc("utf-8");
const char *str2 = cks.getUtf8();

The content of str will be "123, but the content of str2 will be "123". They should be identical.

This worked fine in 9.4.1.68. I could simply rewrite my code, but it looks like there is an error in getEnc() with UTF-8. I should mention that if I do the exact same test using ANSI only, the error does not happen.

Accepted Answer

Thanks. The problem was found and fixed. It only occurs when "utf-8" is the argument to getEnc, because that is the one charset that needs no internal conversion (the string is stored internally as utf-8). The error was caused by the incorrect expectation that a buffer would contain the terminating null, when it did not.
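To make the class of bug described above concrete, here is a minimal sketch of how an unchecked "the buffer ends in a null" assumption can drop the last real character. This is not Chilkat's code; ByteBuffer, toStringAssumingNull and toStringChecked are hypothetical names invented for illustration, and the actual internal cause may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an internal byte buffer (NOT Chilkat's type).
// Here it holds exactly the UTF-8 bytes, with no terminating '\0'.
using ByteBuffer = std::vector<char>;

// Buggy variant: assumes the last byte is a terminating null and drops it
// unconditionally, so the last real character is lost when no null is present.
std::string toStringAssumingNull(const ByteBuffer &buf) {
    if (buf.empty()) return std::string();
    return std::string(buf.data(), buf.size() - 1);
}

// Fixed variant: drops the last byte only if it really is a null.
std::string toStringChecked(const ByteBuffer &buf) {
    std::size_t len = buf.size();
    if (len > 0 && buf[len - 1] == '\0') --len;
    return std::string(buf.data(), len);
}

// Small self-check using the same input as the question: "123" in quotes.
void selfCheck() {
    const ByteBuffer buf = {'"', '1', '2', '3', '"'};
    assert(toStringAssumingNull(buf) == "\"123");   // trailing quote lost
    assert(toStringChecked(buf) == "\"123\"");      // trailing quote kept
}
```

Under these assumptions, the buggy variant reproduces the symptom from the question (the trailing double-quote disappears), while the checked variant returns the full string whether or not the buffer carries a terminator.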

The workaround, until the next "SP1" version release, is to call getUtf8() instead of getEnc("utf-8"). Any other charset passed to getEnc won't cause the problem.
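If the charset name is only known at run time, the workaround could be wrapped in a small helper along these lines. This is a sketch, not part of Chilkat: getEncSafe and isUtf8Name are hypothetical names, and the helper only assumes a string type exposing getUtf8() and getEnc(const char *), as CkString does in the question. It matches only the exact name "utf-8" (case-insensitively); whether other spellings are affected is not stated in the thread.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper: case-insensitive check for the charset name "utf-8".
inline bool isUtf8Name(const char *charset) {
    static const char kName[] = "utf-8";
    const std::size_t n = sizeof(kName) - 1;
    if (charset == nullptr || std::strlen(charset) != n) return false;
    for (std::size_t i = 0; i < n; ++i) {
        if (std::tolower(static_cast<unsigned char>(charset[i])) != kName[i]) {
            return false;
        }
    }
    return true;
}

// Hypothetical wrapper: routes "utf-8" requests to getUtf8() and passes every
// other charset through to getEnc(). StringT is any type with those two
// methods (for example CkString, per the question above).
template <typename StringT>
const char *getEncSafe(StringT &s, const char *charset) {
    return isUtf8Name(charset) ? s.getUtf8() : s.getEnc(charset);
}
```

With a CkString named cks, a call such as getEncSafe(cks, "utf-8") would then return cks.getUtf8(), and the wrapper can be removed once the fixed release is installed.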

Answer

Unfortunately, it appears that CkHttpRequest's LoadBodyFromString(const char *bodyStr, const char *charset) method has the same issue when .put_Utf8(true) is used and the output "charset" is also set to "utf-8", e.g.:

CkHttpRequest req;
req.put_Utf8(true);
req.LoadBodyFromString("{ utf-8 encoded json string }", "utf-8");
...
CkHttpResponse *resp = http.SynchronousRequest(domain, port, ssl, req);

Monitoring the HTTP request with e.g. Fiddler2 shows that the trailing bracket in the example above gets stripped away. Again, this worked with 9.4.x, so I guess that the issue is related to, or perhaps even caused by, the CkString issue?

If I do not specify put_Utf8(true) it works fine, but since the input string is utf-8 encoded, I need to specify it.

Answer

Thanks! Chilkat will re-release the v9.5.0 C++ libs, adding a micro-version (or patch number) which will be something like "v9.5.0.21".

Answer

I see that the newest download is 9.5.0.21, but you have not posted that it was released (you only wrote that you would re-release, but not when)?

Is 9.5.0.21 safe to use?

I noticed that code comments in some cases have character/encoding errors (e.g. CkHttp.h line 1776: Amazon"™s instead of Amazon's). I am quite frankly afraid to use the new version, since proper and trusted encoding is my main concern here.