Why is IE failing to show UTF-8 encoded text?

I have some Chinese characters that I'm trying to display on a Kentico-powered website. This text is copy/pasted into Kentico's FCK editor, then saved, and appears on the site. In Firefox, Chrome, and Safari, the characters appear exactly as expected. In IE 8 Standards mode, I see only boxes.

The text is UTF-8 encoded, and as far as I can tell, it is encoded correctly in the response from the server. There is a Content-Type: text/html; charset=utf-8 response header, and a <meta http-equiv="content-type" content="text/html; charset=UTF-8" /> meta tag on the page too. When I download the HTML from the server and compare the bytes of the characters in question to the original UTF-8 text document, the bytes all match, except the HTML does not include a BOM.
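For anyone wanting to repeat the byte-level comparison, here is a minimal Python sketch. The function name `describe_utf8` and the sample strings are made up for illustration; they are not the actual bytes from the site.

```python
import codecs


def describe_utf8(data: bytes) -> dict:
    """Report whether data starts with a UTF-8 BOM and whether the rest decodes as UTF-8."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    # Drop the 3-byte BOM (EF BB BF) before decoding, if present.
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    try:
        text = body.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        text = None
        valid = False
    return {"has_bom": has_bom, "valid_utf8": valid, "text": text}


# Hypothetical samples: the "original document" has a BOM, the "served HTML" does not.
original = codecs.BOM_UTF8 + "中文".encode("utf-8")
served = "中文".encode("utf-8")

original_info = describe_utf8(original)
served_info = describe_utf8(served)
```

If both results report valid UTF-8 and the same text, the only difference between the two byte streams is the BOM, which matches what I observed.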

This seems to be specific to IE 8 in Standards mode. In IE 8 Quirks mode it works, and it also works in IE 7 Standards and IE 7 Quirks. I'm not sure how Standards mode would cause this problem.

Strangely, if I view-source from IE, the characters show up in the source view correctly.

Any suggestions on what might be wrong here? Am I missing something obvious?


I can't explain this in detail. But this is indeed a known problem.

Here's a small reproducible code snippet:

<!DOCTYPE html>
<html lang="en">
    <body><p>&#65185;<br>0 0</p></body>

Save it in UTF-8 and view it in IE8. You see nothing. Replace 0 0 with 00 and reload the page. It'll work fine! This is absolutely astonishing. Weirdly, replacing 0 0 with a a, or the <br> with a </p><p>, will fix it as well. It probably has something to do with failures in whitespace rendering.

Sorry, I don't have authoritative resources proving this, but it is just more evidence that IE8 isn't as good as we expect it to be. Your best bet is to change the HTML and/or build it up step by step until it works at some point. If that is in vain, add the following meta tag to the head to force IE8 into IE7 mode:

<meta http-equiv="X-UA-Compatible" content="IE=7" />

The default IE encoding is Western European (ISO), so you need to either change it manually to UTF-8 or force IE to use a given encoding like this:

  • HTML 4.01

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  • HTML 5

    <meta charset="UTF-8">

You also need to use the lang attribute on the <html> tag to declare the language, for example for Chinese:

    <html lang="zh">

Just a wild guess, but it might be a font issue. Maybe the fonts available to your browser can't represent said Chinese characters.

I managed to fix the same issue by changing the file's UTF format to "UTF8 With Byte Order Mark".

(The editor I use allows me to switch file formats easily. I'm not sure how to proceed otherwise, but it's worth taking a look at the different UTF file formats. IE(8) simply doesn't seem to like UTF-8 without a Byte Order Mark...)
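If your editor can't switch formats, the same conversion can be scripted. This is only a rough Python sketch of the idea; `add_utf8_bom` is a name I made up, and it works on raw bytes so it can be applied to a file's contents or to a response body.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Return data prefixed with the UTF-8 BOM (EF BB BF), unless it is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


# Example with an accented character: "é" is C3 A9 in UTF-8.
without_bom = "é".encode("utf-8")
with_bom = add_utf8_bom(without_bom)
```

Python's built-in "utf-8-sig" codec produces the same result when encoding text, so writing a file with `open(path, "w", encoding="utf-8-sig")` is another way to get a BOM at the start.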

I was also able to reproduce the snippet from the answer above;

<!DOCTYPE html>
<html lang="en">
    <body><p>&#65185;<br>0 0</p></body>

But my results were "intermittent" while the file was saved as UTF-8 without BOM (sometimes accents would show up, other times the weird chars, and it didn't look like a whitespace rendering issue to me...). Note that I was fiddling with lang="fr" and lang="es", but in all cases, changing the UTF file format seems to have permanently resolved my accent display issues. :)

I'm not 100% familiar with UTF, but if the chars are coded using 2 bytes, one would have to assume that white-space issues and misunderstood chars could be related to misaligned bytes in the sources.
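For reference, UTF-8 is a variable-width encoding, so the number of bytes depends on the character. A quick Python check (the helper name `utf8_length` is just for illustration) shows the sizes for a few characters, including the &#65185; character from the snippet above:

```python
import html


def utf8_length(ch: str) -> int:
    """Number of bytes the given string occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# &#65185; is a decimal character reference for code point U+FEA1.
snippet_char = html.unescape("&#65185;")

lengths = {
    "a": utf8_length("a"),                    # ASCII
    "é": utf8_length("é"),                    # accented Latin letter
    "中": utf8_length("中"),                   # CJK ideograph
    "&#65185;": utf8_length(snippet_char),    # character from the repro snippet
}
```

So accented Latin letters take 2 bytes, while the Chinese characters in the question and the character in the repro snippet take 3 bytes each.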

This may be the same kind of thing that caused Rails 3 to add a snowman character to their output: What is the _snowman param in Ruby on Rails 3 forms for?
