Archived Forum Post

Question:

Read email in Python for different languages (Polish, Greek, Russian)?

Aug 02 '12 at 10:50

I am evaluating the Chilkat email component for Python. I need to read messages (via POP3) in different languages (Polish, Greek, Russian) and cannot find a clear solution. Can you help me with an example or instructions?


Answer:

In Python 2, a plain string (str) is a sequence of bytes that represents characters in some character encoding (such as utf-8). By default, any Chilkat method that returns a string will return the string using the ANSI character encoding. The ANSI character encoding depends on the locale of the computer where your Python script is running. For example, in the USA or Western Europe, the ANSI encoding is typically Windows-1252, which closely matches Latin-1 (iso-8859-1). Such an encoding cannot represent Greek or Russian text, or most Polish letters.
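To see why an ANSI default is a problem for Polish, Greek, or Russian text, here is a minimal sketch that uses only the Python 3 standard library (no Chilkat involved). It assumes the ANSI code page is Windows-1252 (`cp1252`), which is only one common case. The sample strings and the helper name `fits_in` are made up for illustration.

```python
# Sample subjects, purely illustrative.
SUBJECTS = {
    "Polish": "Zażółć gęślą jaźń",
    "Greek": "Καλημέρα",
    "Russian": "Привет",
}


def fits_in(text, encoding):
    """Return True if every character of `text` can be encoded in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Compare an assumed ANSI code page (cp1252) with utf-8 for each language.
for language, subject in SUBJECTS.items():
    print(language, "cp1252:", fits_in(subject, "cp1252"),
          "utf-8:", fits_in(subject, "utf-8"))
```

None of the three samples fits in `cp1252`, while all of them fit in utf-8, which is why the utf-8 setting matters for multilingual mail.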

With a few exceptions, all Chilkat classes have a "Utf8" property that can be set to True. This will cause the object instance to return all strings as utf-8 instead of ANSI. This is what you would need to do if the strings (such as the subject of an email) might be in any language.

For example: emailObject.put_Utf8(True)
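Applied to the POP3 case in the question, the idea might look like the sketch below. It has not been run against a real mailbox and rests on several assumptions: `mailman` is an already unlocked and configured `chilkat.CkMailMan` object (host, login, and password set elsewhere), `CopyMail()` returns a bundle or `None` on failure, and the helper name `fetch_subjects` is hypothetical. Check the method names in the comments against the Chilkat reference for your version.

```python
def fetch_subjects(mailman):
    """Return the subject of every message in the POP3 mailbox.

    `mailman` is assumed to be a configured chilkat.CkMailMan object.
    """
    # Assumption: CopyMail() downloads the messages without deleting them
    # from the server and returns None when the download fails.
    bundle = mailman.CopyMail()
    if bundle is None:
        raise RuntimeError(mailman.lastErrorText())

    subjects = []
    for i in range(bundle.get_MessageCount()):
        email = bundle.GetEmail(i)
        # Ask this object to return utf-8 strings instead of ANSI.
        email.put_Utf8(True)
        subjects.append(email.subject())
    return subjects
```

With Chilkat installed, the caller would create the object with `chilkat.CkMailMan()`, configure it, and then call `fetch_subjects(mailman)`. The Utf8 property is set on each email object because it applies per object instance.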

You need not be concerned about the character encoding used within the email itself. It doesn't matter how the strings within the MIME of the email are represented w.r.t. character encoding or transfer encoding (Base64, quoted-printable, etc.) because Chilkat will automatically convert/decode and return the string as utf-8 or ANSI (depending on the Utf8 property setting).
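For a sense of what that decoding involves, the sketch below uses the `email.header` module from the Python 3 standard library (not Chilkat) to decode two hand-made subject headers: one Base64-encoded in utf-8 and one quoted-printable in iso-8859-2. The header values and the helper name `decode_subject` are invented for illustration, and this is not a description of how Chilkat is implemented.

```python
from email.header import decode_header, make_header


def decode_subject(raw_header):
    """Decode RFC 2047 encoded-words into a regular Python string."""
    return str(make_header(decode_header(raw_header)))


# Two illustrative raw Subject values as they might appear in the MIME.
russian_raw = "=?utf-8?B?0J/RgNC40LLQtdGC?="   # Base64, utf-8
polish_raw = "=?iso-8859-2?Q?=BF=F3=B3w?="     # quoted-printable, iso-8859-2

russian = decode_subject(russian_raw)  # 'Привет'
polish = decode_subject(polish_raw)    # 'żółw'
```

Both headers use different charsets and transfer encodings, yet the caller ends up with ordinary strings either way.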

The CkString class is a little special in that many of its methods explicitly state the character encoding of the string that is returned. In these cases, the Utf8 property does not apply. (For example, the CkString.getUtf8() method always returns the utf-8 representation of the string contained in the CkString object instance.)

For more information about Python w/ character encodings, see these Chilkat examples:
Python: Defining Source Code Encoding
Python: Working with Strings in any Charset