In C# I am using the code below, which works fine for ASCII characters but returns null if the UnsecuredString passed to crypt.EncryptStringENC contains non-ASCII characters such as Հայաստանի Հա
Is there a workaround to this issue?
Here is the code:
asked Jul 19 '12 at 09:46
The answer to your question requires that you first understand the fundamental difference between a "string" data type in a language such as C#, and the byte representation of a string for a given character encoding (i.e. charset).
In C#, the string type represents a string of Unicode characters. (string is an alias for System.String in the .NET Framework.) Notice the use of the word "characters" (not "bytes"). The methods of the string class in C# (and in most programming languages that have a Unicode "string" class) hide the underlying byte representation of the characters. There are methods to get the length (in characters, or more precisely UTF-16 code units in .NET, not bytes), to get the Nth character, to append, prepend, find sub-strings, or display the glyphs in a user interface control, such as a text box. All of these methods are designed to not care about the byte representation the string will have once it leaves the program.
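To make the character/byte distinction concrete, here is a minimal sketch using only System.Text. The class and method names are made up for illustration.

```csharp
using System.Text;

public static class StringVsBytes
{
    // What string.Length reports: the number of UTF-16 code units,
    // which equals the number of characters for text in the Basic Multilingual Plane.
    public static int CharCount(string s) => s.Length;

    // The number of bytes the same string occupies in a specific encoding.
    public static int ByteCount(string s, Encoding encoding) => encoding.GetByteCount(s);
}
```

For "\u00C9" (É), CharCount returns 1, while ByteCount returns 2 for Encoding.UTF8 and 1 for Encoding.GetEncoding("iso-8859-1"). For Armenian text such as "Հայաստանի Հա", each letter takes two bytes in utf-8, so the byte count is larger than the character count.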
However, as soon as you want to write the string to a file, the byte representation of the characters becomes important. The same goes for reading a string from a file. For example, in C# you can write a string to a file via the System.IO.File.WriteAllText method. Two overloads of this method are relevant here:
// The first overload does not take a character encoding; it writes utf-8 (without a byte order mark):
public static void WriteAllText(string path, string contents)
// The second overload allows you to explicitly specify the character encoding:
public static void WriteAllText(string path, string contents, Encoding encoding)
This is important because any given character can have different byte representations based on the character encoding.
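A small sketch of how the choice of WriteAllText overload changes what lands on disk. The helper names and the use of a temp file are assumptions for illustration only.

```csharp
using System.IO;
using System.Text;

public static class WriteAllTextSketch
{
    // Writes with the overload that takes no encoding (utf-8, no BOM)
    // and returns the raw bytes found in the file afterwards.
    public static byte[] BytesWrittenByDefault(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text);
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    // Same, but with an explicitly chosen encoding.
    public static byte[] BytesWrittenWith(string text, Encoding encoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text, encoding);
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Writing "\u00C9" (É) with BytesWrittenByDefault produces a 2-byte file, while BytesWrittenWith(..., Encoding.GetEncoding("iso-8859-1")) produces a 1-byte file. Note that passing Encoding.UTF8 explicitly also prepends a byte order mark, so that file is not byte-identical to the one from the default overload.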
For example, consider this character: É
In the iso-8859-1 character encoding, it is represented by a single byte: 0xC9
In the utf-8 character encoding, it is represented by two bytes: 0xC3 0x89
If a program writes a file containing "É" using the utf-8 encoding, and another program reads the file but instead interprets the bytes according to the iso-8859-1 encoding, each of the two bytes is decoded as a separate, unrelated character and the result is garbage.
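The mismatch described above can be reproduced in memory, without any files. This is only a sketch; the class and method names are invented.

```csharp
using System;
using System.Text;

public static class EncodingMismatch
{
    // Hex dump of the bytes a string has in a given encoding, e.g. "C3-89".
    public static string Hex(string s, Encoding encoding) =>
        BitConverter.ToString(encoding.GetBytes(s));

    // Encodes as utf-8, then (wrongly) decodes those bytes as iso-8859-1.
    public static string WriteUtf8ReadLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }
}
```

Hex("\u00C9", Encoding.UTF8) gives "C3-89" and Hex("\u00C9", Encoding.GetEncoding("iso-8859-1")) gives "C9". WriteUtf8ReadLatin1("\u00C9") returns a two-character string (U+00C3 followed by the control character U+0089) instead of É.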
Now, to finally answer your question: encryption and decryption algorithms operate on bytes. (Just like file reading and writing: ultimately it is bytes that are read and written.) So the question is: what byte representation of the "string" is being encrypted? Is it utf-8, ANSI (more about ANSI below), ucs-2, or something else?
The Chilkat.Crypt2.Charset property controls the character encoding used for the byte representation of the string, and it defaults to the ANSI charset.
The ANSI charset (the default Windows code page for the system's locale) is typically a one-byte-per-character encoding, meaning that it cannot represent more than 256 characters and is restricted to the local language. In your case above, the EncryptStringENC method is (internally) trying to represent "Հայաստանի Հա" in the ANSI character encoding, and since these characters have no representation in that encoding, the method fails and returns null. (See the LastErrorText property for some clues about what happened.)
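The same kind of failure can be seen with plain .NET encodings. The sketch below uses iso-8859-1 as a stand-in for an ANSI code page (an assumption; your actual ANSI code page depends on the Windows locale) and asks .NET to throw instead of silently substituting "?". It does not show what Chilkat does internally, and the names are hypothetical.

```csharp
using System.Text;

public static class SingleByteLimit
{
    // Returns true if every character of 'text' has a byte representation
    // in the named encoding, false if at least one character does not.
    public static bool CanEncode(string text, string encodingName)
    {
        Encoding strict = Encoding.GetEncoding(
            encodingName,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            // At least one character cannot be represented in this encoding.
            return false;
        }
    }
}
```

CanEncode("Հայաստանի Հա", "iso-8859-1") returns false, while CanEncode("Հայաստանի Հա", "utf-8") returns true.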
The solution is to set the Chilkat.Crypt2.Charset property to "utf-8" before calling EncryptStringENC. The method will then encrypt the utf-8 representation of the string, and utf-8 can produce a byte representation for any character in any language.
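As a rough sketch of the fix (the code from the question is not shown here, so this assumes a Chilkat.Crypt2 object that is already unlocked and configured with your algorithm, key, IV and output encoding; the wrapper class and method names are made up):

```csharp
using System;

public static class CryptCharsetSketch
{
    // 'crypt' is assumed to be configured exactly as in your existing code.
    // The only change is the Charset property.
    public static string EncryptAsUtf8(Chilkat.Crypt2 crypt, string unsecuredString)
    {
        // Encrypt the utf-8 bytes of the string instead of the ANSI bytes.
        crypt.Charset = "utf-8";

        string encrypted = crypt.EncryptStringENC(unsecuredString);
        if (encrypted == null)
        {
            // LastErrorText describes why the call failed.
            Console.WriteLine(crypt.LastErrorText);
        }
        return encrypted;
    }
}
```

The object that decrypts should presumably use the same Charset value, so that the decrypted bytes are interpreted as utf-8 when they are turned back into a string.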
answered Jul 19 '12 at 10:55