stublackett Posted December 20, 2010

Hi all,

I'm working on an X-Cart site. Looking through the pages, any page with an apostrophe in the content shows a � in its place. The site's charset is UTF-8. Changing it to ISO fixes that but then breaks the bulleted lists.

Has anyone had this problem in the past, and if so, how did you solve it? Any help appreciated.

Cheers
johnny86 Posted December 20, 2010

Charset issues are always a bit hard. The best thing you can do to make sure your text shows up correctly is to use the same charset all over your program. This means:

- Save all your HTML, PHP, XML, CSS files etc. in UTF-8 format (if UTF-8 is the desired charset)
- Make sure the default charset in PHP / Apache / MySQL is set to UTF-8
- Always send the appropriate headers, and keep the charset declared in the meta info in your source

Make sure all data you get is in UTF-8 format. This is the trickiest part for me. There is no way of knowing that a client will send you UTF-8 data, even if you have done everything correctly on your side. There will always be misbehaving browsers/clients that either want to harm you or don't work according to standards. There isn't much you can do about that either, since it's almost impossible to identify the charset of incoming data except by testing charsets one by one.

You could check PHP's iconv functions for some help. There are some user-made functions in the comments too that can identify a UTF-8 formatted string, for example. All in all, those don't get you too far. The best thing to do is make sure you use the same charset in everything in your project.

With iconv you could change all of the contents to UTF-8. utf8_encode() converts ISO-8859-1 to UTF-8, and utf8_decode() goes the other way around. Try using those functions on some strings that you know are ISO-8859-1 or UTF-8. You always need to know which charset a string is in to be able to convert it to another charset.

Also: http://fi.php.net/manual/en/book.iconv.php

Hope this helps. I'd like to hear other opinions too, because charset issues are annoying, especially when I don't know which charset the data I'm receiving is encoded in.

And the great thing about making sure all your data is encoded in UTF-8 is that whenever you print data to your page, htmlspecialchars() is enough sanitizing.
So all you need to do after that is run all your "untrusted" data through that function, provided you know you have UTF-8 all over. =) Of course you might want to sanitize something else too, but that will be enough to keep the output from breaking your site.
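
As a rough sketch of the advice in this reply (not X-Cart code): check whether a string is already valid UTF-8, convert it with iconv if it is not, and escape it with htmlspecialchars() on output. The helper names `to_utf8()` and `html_out()` are made up for illustration, and the `ISO-8859-1` default is only an assumption about where the legacy data came from. It also assumes the mbstring and iconv extensions are loaded.

```php
<?php
// Hypothetical helper (name and default are illustrative only).
// Returns $text as UTF-8, converting from $assumedCharset when the
// bytes are not already valid UTF-8.
function to_utf8(string $text, string $assumedCharset = 'ISO-8859-1'): string
{
    // mb_check_encoding() reports whether the bytes form valid UTF-8.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Not valid UTF-8: convert from the charset we ASSUME the data is in.
    $converted = iconv($assumedCharset, 'UTF-8', $text);
    if ($converted === false) {
        throw new InvalidArgumentException('Could not convert text to UTF-8');
    }
    return $converted;
}

// Hypothetical helper: normalise untrusted text to UTF-8, then escape it
// for printing inside a UTF-8 HTML page.
function html_out(string $text): string
{
    return htmlspecialchars(to_utf8($text), ENT_QUOTES, 'UTF-8');
}

// Declare the page charset before any output is sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// "caf\xE9" is "café" in ISO-8859-1; the lone 0xE9 byte is invalid UTF-8.
$legacy = "caf\xE9 <b>";
$safe   = html_out($legacy);   // "café &lt;b&gt;" as UTF-8
```

The validity check is only a heuristic: plain ASCII, and the occasional ISO-8859-1 string that happens to form valid UTF-8, pass through unchanged. So, as said above, you still need to know which charset the data is really in. If the content was pasted from a Windows-1252 source, passing that charset name as the second argument instead is worth trying.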