
UTF-8 / ASCII encoding errors


mavera2


After parsing an external site, I set the variable $word1 to the string I obtained.

When I run

$word1Encod = mb_detect_encoding($word1);

it gives ASCII.

I also have a string like:

$mycomment = "başlamış bugün";

Then

$mycommentEncod = mb_detect_encoding($mycomment);

gives UTF-8.

When I join them as

$joined = $word1 . $mycomment;

the result is detected as UTF-8.

But the final string has characters like

ü

ı

which are Turkish characters that come from the variable $word1.

Although UTF-8 is a superset of ASCII, I tried

mb_convert_encoding($word1, "UTF-8", "ASCII");

but it still didn't help.

My PHP file's encoding is UTF-8. I also tried UTF-8 without BOM.

Do you have any recommendations?

Thank you


You can't convert Turkish characters into ASCII encoding.

 

How are you generating the XML? I suspect the problem is that whatever you use is turning the UTF-8 octets into numeric HTML entities. Probably because it doesn't know better: wrong encoding on the feed, not smart enough...
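If the suspicion above is right, something like the following might show what is going on. This is only a sketch: the sample value of $word1 is made up, and it assumes the parsed string really contains numeric HTML entities, which the original post does not confirm. It uses only mb_detect_encoding, mb_convert_encoding and html_entity_decode, and the file has to be saved as UTF-8.

```php
<?php
// Assumption: the parsed string arrives with numeric HTML entities (hypothetical sample).
$word1     = 'g&#252;n ayd&#305;n';
$mycomment = 'başlamış bugün';

// Explicit detection order with strict mode, so the result does not depend on defaults.
$order = ['ASCII', 'UTF-8'];

// The entities are plain ASCII bytes, so detection reports ASCII.
$before = mb_detect_encoding($word1, $order, true);

// Converting ASCII to UTF-8 changes nothing: every ASCII byte is already valid UTF-8.
$converted = mb_convert_encoding($word1, 'UTF-8', 'ASCII');

// Decoding the entities produces the real characters as UTF-8.
$decoded = html_entity_decode($word1, ENT_QUOTES | ENT_HTML5, 'UTF-8');
$after   = mb_detect_encoding($decoded, $order, true);

$joined = $decoded . ' ' . $mycomment;

echo $before, "\n";   // ASCII
echo $after, "\n";    // UTF-8
echo $joined, "\n";   // gün aydın başlamış bugün
```

If $word1 turns out not to contain entities at all, then the problem is somewhere else (for example the charset declared on the output) and this decoding step will not change anything.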
