UTF-8 encoded HTML pages show � (question marks) instead of characters
I have the standard XAMPP installation on Win7 (x64). I had my share of encoding troubles in a past project, where the MySQL encoding did not match the PHP encoding, which in turn sometimes output HTML in yet other encodings, so I decided to consistently encode everything as UTF-8.
I'm just getting started with the HTML markup and am already running into trouble.
- My page is saved as UTF-8 (no BOM, I think) //update: It turns out this was NOT the case. The file was actually saved as ISO-8859-1. I found this out later thanks to Sherm Pendley's answer. I had to go back and change my project settings (which were set to "ISO-8859-1") to the desired "UTF-8".
- PHP is set via .htaccess to serve .php pages as UTF-8 with: AddCharset UTF-8 .php
- html has a meta tag specifying: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
- To test, I also used PHP's header('Content-Type:text/html; charset=UTF-8');
The page is evidently served as UTF-8 (Firefox and Chrome recognize it as such), but any special characters such as é, á or ¡ just show up as �. The same happens when viewing the source code.
When dropping the encoding settings mentioned above all characters are rendered correctly but the encoding that is detected shows either windows-1252 or ISO-8859-1 depending on the browser.
How come? I'm very puzzled. I would have expected the exact opposite behavior. Any advice is welcome, thanks!
edit: Hopefully this helps a bit more. This is the response header (as per Firebug):
HTTP/1.1 200 OK
Date: Sat, 26 Mar 2011 20:49:44 GMT
Server: Apache/2.2.14 (Win32) DAV/2 mod_ssl/2.2.14 OpenSSL/0.9.8l mod_autoindex_color PHP/5.3.1 mod_apreq2-20090110/2.7.1 mod_perl/2.0.4 Perl/v5.10.1
X-Powered-By: PHP/5.3.1
Content-Length: 91
Keep-Alive: timeout=5, max=99
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8
"When dropping the encoding settings mentioned above all characters are rendered correctly but the encoding that is detected shows either windows-1252 or ISO-8859-1 depending on the browser."
Then that's what you're really sending. None of the encoding settings in your bullet list will actually modify your output in any way; all they do is tell the browser what encoding to assume when interpreting what you send. That's why you're getting those �s - you're telling the browser that what you're sending is UTF-8, but it's really ISO-8859-1.
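The mismatch can be reproduced outside the browser. The following is a minimal sketch in Python, used here purely for illustration; the sample string is an assumption, not the asker's actual page content. It encodes text as ISO-8859-1 and then decodes the same bytes as UTF-8, which is roughly what a browser does when the header claims UTF-8.

```python
# Illustrative only: the sample text is an assumption, not taken from the question.
text = "é, á or ¡"

# What the editor actually saved: one byte per character in ISO-8859-1.
latin1_bytes = text.encode("iso-8859-1")

# What happens when those bytes are labelled "charset=utf-8": they are decoded
# as UTF-8, and every invalid sequence becomes U+FFFD (the replacement character).
as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# What happens when the label matches the real encoding of the bytes.
as_latin1 = latin1_bytes.decode("iso-8859-1")

print(as_utf8)    # the accented characters are replaced by U+FFFD
print(as_latin1)  # the original text comes back intact
```

The declared charset never changes the bytes. It only changes how they are read, so the fix is to make the bytes match the label (or the label match the bytes).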
In my case, the database returned latin1 while my browser expected UTF-8.
So for MySQLi I did:
See http://php.net/manual/en/mysqli.set-charset.php for more info
Also check that every .php file that prints text is itself saved in UTF-8.
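One way to run that check in bulk is to try decoding each file as UTF-8 and list the ones that fail. This is a rough sketch in Python rather than PHP; `is_valid_utf8` and `find_non_utf8` are hypothetical helper names, and the `*.php` pattern is an assumption about the project layout. Note that a pure-ASCII file also passes, since ASCII is a subset of UTF-8.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8(root: str, pattern: str = "*.php") -> list:
    """Return the files under root matching pattern that are not valid UTF-8."""
    return [
        path
        for path in sorted(Path(root).rglob(pattern))
        if path.is_file() and not is_valid_utf8(path.read_bytes())
    ]
```

Calling `find_non_utf8(".")` from the project root would return the paths that need to be re-saved as UTF-8 in the editor.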
Tell PDO your charset when you connect, something like:
new PDO("mysql:host=$host;dbname=$DB_name;charset=utf8;", $username, $password);
Notice the: charset=utf8; part.
hope it helps!
I'm from Brazil and I create my databases using latin1_spanish_ci. For the HTML and everything else I use:
The data comes through correctly with é, ã and ç. Sometimes I have to write text in the HTML using its character codes, such as:
You can find the codes in this page: http://www.ascii.cl/htmlcodes.htm
Hope this helps. I remember it was REALLY annoying.
Looks like nobody mentioned
SET NAMES utf8;
I found this solution here and it helped me. How to apply it:
To be all UTF-8, issue the following statement just after you’ve made the connection to the database server: SET NAMES utf8;
Maybe this will help someone.
The problem is the charset that Apache uses to serve the pages. I work with Linux, so I don't know anything about XAMPP. I had the same problem, and what solved it for me was enabling the charset in the charset config file (it is commented out by default).
In my case it is in /etc/apache2/conf.d/charset, but since you're using Windows the location will be different, so take this as an idea of how to solve it.
In the end, my charset config file looks like this:
# Read the documentation before enabling AddDefaultCharset.
# In general, it is only a good idea if you know that all your files
# have this encoding. It will override any encoding given in the files
# in meta http-equiv or xml encoding tags.
AddDefaultCharset UTF-8
I hope it helps.