
Import text files into SQL - Character conversion needed?


slowfib


I'm importing some text files from my Windows servers into a SQL database, but I'm running into what I think is an issue with either a special character or the encoding (UTF-8 or UTF-16). I haven't dealt with this before, so I'm really not sure.

 

I read in the file like this:

// Open in binary mode so the bytes are read unmodified.
$handler = fopen($file, "rb");
$Data = fread($handler, filesize($file));
fclose($handler);

 

The text file itself contains the data formatted like this:

 

Unable to deliver this message because the follow error was encountered: "This message is a delivery status notification that cannot be delivered.".

The specific error code was 0xC00402C7.

 

The message sender was <>.

 

The message was intended for the following recipients.

OnlineHelp@somedomain.com

 

If I simply echo out this data:

echo $Data;

 

It comes out in Firefox like the following. I can see that Firefox is choosing to view it as Western (ISO-8859-1), but if I switch to Unicode (UTF-16), the data displays correctly.

U�n�a�b�l�e� �t�o� �d�e�l�i�v�e�r� �t�h�i�s� �m�e�s�s�a�g�e� �b�e�c�a�u�s�e� �t�h�e� �f�o�l�l�o�w� �e�r�r�o�r� �w�a�s� �e�n�c�o�u�n�t�e�r�e�d�:� �"�T�h�i�s� �m�e�s�s�a�g�e� �i�s� �a� �d�e�l�i�v�e�r�y� �s�t�a�t�u�s� �n�o�t�i�f�i�c�a�t�i�o�n� �t�h�a�t� �c�a�n�n�o�t� �b�e� �d�e�l�i�v�e�r�e�d�.�"�.� � � � �T�h�e� �s�p�e�c�i�f�i�c� �e�r�r�o�r� �c�o�d�e� �w�a�s� �0�x�C�0�0�4�0�2�C�7�.� � � � � � �T�h�e� �m�e�s�s�a�g�e� �s�e�n�d�e�r� �w�a�s� �<�>�.� � � � � � �T�h�e� �m�e�s�s�a�g�e� �w�a�s� �i�n�t�e�n�d�e�d� �f�o�r� �t�h�e� �f�o�l�l�o�w�i�n�g� �r�e�c�i�p�i�e�n�t�s�.� � � �O�n�l�i�n�e�H�e�l�p�@�E�l�i�t�e�R�a�c�i�n�g�.�c�o�m� � �
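The pattern above (one unknown character after every letter) is what UTF-16 text tends to look like when it is read as a single-byte encoding. Here is a minimal sketch of how the bytes could be inspected to confirm that. The `$sample` string is a hypothetical stand-in for the first two characters of the file, and it assumes little-endian UTF-16, which is what Windows tools commonly write:

```php
<?php
// Hypothetical sample: "Un" encoded as UTF-16LE (an assumption about
// how the Windows server wrote the file).
$sample = "U\x00n\x00";

// Each ASCII character is followed by a NUL byte.
echo bin2hex($sample), "\n"; // prints 55006e00

// Counting NUL bytes is a rough check for UTF-16 text.
$nulCount = substr_count($sample, "\x00");
echo $nulCount, " of ", strlen($sample), " bytes are NUL\n";
```

Running `bin2hex()` on the first few bytes of the real `$Data` would show whether the same `00` bytes are present, and whether the file starts with a byte order mark (`fffe` or `feff`).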

 

And finally, if I try to insert the data into SQL, the data looks like this:

 

INSERT INTO Badmail VALUES(

2

, '00360053425643112201000000004.BDR'

, 'U n a b l e  t o  d e l i v e r  t h i s  m e s s a g e  b e c a u s e  t h e  f o l l o w  e r r o r  w a s  e n c o u n t e r e d :  " T h i s  m e s s a g e  i s  a  d e l i v e r y  s t a t u s  n o t i f i c a t i o n  t h a t  c a n n o t  b e  d e l i v e r e d . " .

 

 

 

T h e  s p e c i f i c  e r r o r  c o d e  w a s  0 x C 0 0 4 0 2 C 7 .

 

 

 

 

 

T h e  m e s s a g e  s e n d e r  w a s  < > .

 

 

 

 

 

T h e  m e s s a g e  w a s  i n t e n d e d  f o r  t h e  f o l l o w i n g  r e c i p i e n t s .

 

O n l i n e H e l p @ S o m e d o m a i n . c o m

 

'

, '2010-11-30 07:17:05'

, GETDATE())

 

 

So basically, I'm not sure whether I'm supposed to convert the text to UTF-8 or whether I just need a bunch of str_replace calls to correct the data before inserting.
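Since Firefox renders the data correctly as UTF-16, converting the encoding looks more promising than str_replace. The following is only a sketch under assumptions: `toUtf8()` is a hypothetical helper name, the files are assumed to be UTF-16 (little-endian when no byte order mark is present), and it relies on PHP 7+ with the mbstring extension for `mb_convert_encoding()`:

```php
<?php
// Hypothetical helper: convert file contents to UTF-8, assuming UTF-16 input.
function toUtf8(string $raw): string
{
    // A UTF-16 byte order mark (BOM) tells us the byte order; strip it.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16LE');
    }
    if (strncmp($raw, "\xFE\xFF", 2) === 0) {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16BE');
    }
    // No BOM: assume little-endian, the usual Windows choice (an assumption).
    return mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
}

// Made-up sample: BOM followed by "Un" in UTF-16LE.
echo toUtf8("\xFF\xFEU\x00n\x00"), "\n";
```

With the real file, the call would be something like `$Data = toUtf8($Data);` right after reading it. Whether UTF-8 is the right target depends on the column type and the database driver, so treat that part as an open question too.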

I'd appreciate any feedback or suggestions anyone has.

 

Thank you.
