

I am trying to convert a simple HTML file that contains just a centered bullet symbol. I am using the CkHtmlToXmlW class for this.

When I use the ConvertFile method, I get the "Replacement Character" instead of the bullet. When I use the ToXml method, I get a quotation mark instead.

What should I do to convert the bullet correctly?

I use VS2010 with x86.

Thanks in advance.

asked Mar 03 '15 at 03:33



The HTML file contains this:


The HTML needs a META tag declaring the utf-8 charset. If you examine the HTML in a hex editor, you will see that the bullet char is composed of 3 bytes in the utf-8 encoding. When the HTML does not specify a charset in a META tag, the default is ANSI, so the bytes that compose the bullet are interpreted according to the 1-byte-per-char ANSI encoding of whatever computer you happen to be running on...
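
To make the byte-count point concrete, here is a minimal C++ sketch that uses only the standard library and no Chilkat calls. The function names (encodeUtf8, countAsSingleByte, countAsUtf8) are made up for this illustration. It encodes the bullet (U+2022) as utf-8 and compares how many characters a 1-byte-per-char reading sees with how many a utf-8 reading sees. For reference, a common form of the charset declaration is `<meta http-equiv="Content-Type" content="text/html; charset=utf-8">`.

```cpp
#include <cstddef>
#include <vector>

// Encode a single Unicode code point as utf-8 bytes.
// Illustrative helper, not part of any library.
std::vector<unsigned char> encodeUtf8(char32_t cp) {
    std::vector<unsigned char> out;
    if (cp < 0x80) {
        out.push_back(static_cast<unsigned char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<unsigned char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<unsigned char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<unsigned char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<unsigned char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<unsigned char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<unsigned char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<unsigned char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<unsigned char>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// A 1-byte-per-char (ANSI-style) reading treats every byte as one character.
std::size_t countAsSingleByte(const std::vector<unsigned char>& bytes) {
    return bytes.size();
}

// A utf-8 reading counts only lead bytes; continuation bytes look like 10xxxxxx.
std::size_t countAsUtf8(const std::vector<unsigned char>& bytes) {
    std::size_t n = 0;
    for (unsigned char b : bytes) {
        if ((b & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

Calling encodeUtf8(0x2022) gives the bytes E2 80 A2. Read as Windows-1252, for instance, those three bytes display as "â€¢" rather than a single bullet.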


answered Mar 03 '15 at 12:50


chilkat ♦♦

I understand.

Earlier, I wanted to keep things simple, so I built that file myself. The real file I am trying to convert is this. It is encoded in utf-16. I need the output XML in utf-8, but I can't find a way to do that with ConvertFile. I tried calling put_XmlCharset(L"utf-8"), but it still doesn't work.

Thanks in advance.

(Mar 08 '15 at 04:24) odavidi
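
One possible workaround, offered only as a sketch and not confirmed anywhere in this thread, is to transcode the utf-16 input to utf-8 before handing it to the converter. The code below uses only the standard library and does not call Chilkat. The names appendUtf8 and utf16leToUtf8 are hypothetical, and it assumes the file is little-endian utf-16 (the usual case on Windows), with an optional FF FE byte order mark.

```cpp
#include <cstddef>
#include <string>

// Append one code point to a utf-8 string. Illustrative helper.
static void appendUtf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Read a little-endian 16-bit unit starting at byte offset i.
static char32_t unitAt(const std::string& in, std::size_t i) {
    return static_cast<unsigned char>(in[i]) |
           (static_cast<char32_t>(static_cast<unsigned char>(in[i + 1])) << 8);
}

// Transcode raw utf-16LE bytes to utf-8 (assumption: little-endian input).
std::string utf16leToUtf8(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    // Skip a leading byte order mark (FF FE) if present.
    if (in.size() >= 2 && unitAt(in, 0) == 0xFEFF) i = 2;
    while (i + 1 < in.size()) {
        char32_t u = unitAt(in, i);
        i += 2;
        // Combine a high surrogate with a following low surrogate.
        if (u >= 0xD800 && u <= 0xDBFF && i + 1 < in.size()) {
            char32_t lo = unitAt(in, i);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            }
        }
        // Any surrogate still unpaired becomes U+FFFD.
        if (u >= 0xD800 && u <= 0xDFFF) u = 0xFFFD;
        appendUtf8(out, u);
    }
    return out;
}
```

The resulting utf-8 string could then be saved to a new file or passed to the converter as the HTML source. Whether that alone fixes the ConvertFile output is untested here.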

Asked: Mar 03 '15 at 03:33

Seen: 2,813 times

Last updated: Mar 08 '15 at 04:24
