Archived Forum Post


Question:

XML Email Attachment shows ??? Chars?

Nov 10 '16 at 09:13

We are using the following code to add an attachment to an Email object and send the email:

Email objEmail = new Email();
objEmail.Subject = SubjectFromDB;
objEmail.SetHtmlBody("<html><body>" + Final_Body + "</body></html>");
objEmail.AddTo("", s);
objEmail.AddCC("", s);
objEmail.AddBcc("", s);
objEmail.AddHeaderField("X-Priority", "1");
objEmail.AddFileAttachment(FilePath_Physical);

With some files we get ??? at the beginning of the file if it is XML. Also, some characters such as ø have been converted to ?? by the time the recipient receives the email.

For example: Original file before it is attached:

<?xml version="1.0" encoding="utf-8"?>
.
.

What the receiver receives:

???<?xml version="1.0" encoding="utf-8"?>

Answer

The three "???" chars are the utf-8 BOM (Byte Order Mark), also known as the utf-8 preamble: the bytes 0xEF 0xBB 0xBF at the very start of the file. See https://en.wikipedia.org/wiki/Byte_order_mark

When Chilkat adds a file attachment, it must not modify the bytes of the file that is attached. If your XML file contains a BOM, then the attached file will also contain the BOM. There are a few solutions:

  1. However the XML file is produced, fix that process so that it produces an XML file that does not include the BOM.
  2. Use the Chilkat StringBuilder class to load the XML file into the StringBuilder, then add the attachment by calling email.AddStringAttachment2(path,sbXml.GetAsString(),"no-bom-utf8")
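
To make the BOM concrete, here is a minimal sketch in Python (not the C# of the question, and not Chilkat code) that checks for the three BOM bytes and strips them if present. The helper name strip_utf8_bom and the sample bytes are hypothetical, chosen only for illustration; the same check could be done in whatever process produces the XML file.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical sample: an XML declaration preceded by the BOM.
xml_with_bom = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="utf-8"?>'
xml_clean = strip_utf8_bom(xml_with_bom)

# A file without a BOM passes through unchanged.
xml_unchanged = strip_utf8_bom(xml_clean)
```

A viewer that does not recognize the BOM has three bytes it cannot display, which is consistent with the three "?" characters seen by the receiver.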

The issue with the "ø" character is likely caused by a mismatch between the charset actually used by the XML file and the charset declared in the line "<?xml version="1.0" encoding="utf-8"?>".

For example, say you use Notepad to create the XML file, type in XML text that includes the "ø", and save it. You've just created the error. If you saved the file in the default ANSI encoding (Windows-1252, which encodes "ø" the same way as iso-8859-1), the "ø" character is stored as the single byte 0xF8. In utf-8 it would be the two bytes 0xC3 0xB8. The XML declaration, however, tells the XML reader that the text is utf-8, so the reader interprets the bytes as utf-8 and finds a byte that is not valid utf-8. That is the problem: the ø is NOT represented as utf-8 bytes.
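
The mismatch can be reproduced in a few lines. This is a Python sketch for illustration only (the variable names are hypothetical); it encodes "ø" both ways and then reads the Windows-1252 byte as if it were utf-8, which is roughly what a reader trusting the XML declaration would do.

```python
text = "ø"

# The same character has different byte representations per charset.
cp1252_bytes = text.encode("cp1252")   # single byte
utf8_bytes = text.encode("utf-8")      # two bytes

# Reading the Windows-1252 byte as utf-8 fails; with errors="replace"
# the invalid byte becomes U+FFFD, the Unicode replacement character.
misread = cp1252_bytes.decode("utf-8", errors="replace")

# A strict decode raises instead of substituting.
try:
    cp1252_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```

How the damaged character is finally displayed ("?", "??", or a replacement glyph) depends on the software along the way, so the exact output seen by the receiver may differ.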

Similar fixes for that are possible:

  1. However the XML file is produced, fix that process so that the XML file is produced using utf-8.
  2. If using StringBuilder, explicitly specify "Windows-1252" as the charset when loading the XML into the StringBuilder. Also, use an editor such as EmEditor to examine text files in different character encodings to see what you really have.
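
As a rough sketch of the first fix, assuming the file really is Windows-1252, the bytes can be decoded with the charset they were actually written in and re-encoded as utf-8 so that they match the XML declaration. This is Python for illustration only, not Chilkat code; the helper name transcode_to_utf8 and the sample XML are hypothetical.

```python
def transcode_to_utf8(data: bytes, source_charset: str = "cp1252") -> bytes:
    """Decode bytes using their real charset, then re-encode them as UTF-8."""
    return data.decode(source_charset).encode("utf-8")

# Hypothetical sample: declares utf-8 but was saved as Windows-1252,
# so the "ø" is the single byte 0xF8.
mislabeled = b'<?xml version="1.0" encoding="utf-8"?><name>\xf8</name>'
fixed = transcode_to_utf8(mislabeled)

# After transcoding, the bytes are valid utf-8 and the character survives.
roundtrip = fixed.decode("utf-8")
```

This only helps when the source charset is known. Guessing it wrong produces different but equally wrong characters, which is why inspecting the file in an encoding-aware editor first is worthwhile.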