Need help reading a .gz file line by line


affordit

I have a script that reads a .gz file into an array and prints the name of each record, but it does not work on larger files. Is there a way to read one line at a time?

Here is the code I have so far.

<?php

if ($handle = opendir('.')) {

    print "<ol>";

    while (false !== ($file = readdir($handle))) {
        if($file != '..' && $file!="." && $file!="start_update.php" && $file!="sharons_dbinfo.inc.php" && $file!="root.php" && $file!="read_directory.php" && $file!="read_dir.php" && $file!="new_category.php" && $file!="index.php" && $file!="file_count.php" && $file!="dir_loop2.php" && $file!="dir_loop1.php" && $file!=".htaccess" && $file!="Answer.txt" && $file!="Crucial_Technology-Crucial_US_Product_Catalog_Data_Feed.txt"){
            $filename = $file;
            $go = filesize($filename);
            if($go >= 1){
                $filename2 = explode("-", $filename);
                $filename2 = $filename2[0];
                echo str_replace("_"," ",$filename2) . ' | Filesize is: ' . filesize($filename) . ' bytes<br>';
                $gz = gzopen($filename, 'r');
                $lines = gzfile($filename,10000);
                foreach ($lines as $line) {
                    $line2 = explode(",", $line);
                    $line2 = str_replace("," , "-" , $line2);
                    echo "<li>".str_replace("," , "-" , $line2[4])."</li><br>";
                }
            }
        }
    }

    closedir($handle);
}
?>
</ol>
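
For reference, here is a minimal sketch of reading a .gz file one line at a time instead of loading it all with gzfile(). It assumes PHP 7 or later with the zlib extension enabled. The helper name forEachGzLine and the sample file feed_sample.txt.gz are made up for illustration and are not part of the script above.

```php
<?php
// Hypothetical helper: calls $onLine once per line of a gzip file,
// so only one line is held in memory at a time.
function forEachGzLine(string $path, callable $onLine): int
{
    $gz = gzopen($path, 'rb');
    if ($gz === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $count = 0;
    try {
        // gzgets() returns false at end of file or on error.
        while (($line = gzgets($gz)) !== false) {
            $onLine(rtrim($line, "\r\n"));
            $count++;
        }
    } finally {
        gzclose($gz);
    }
    return $count;
}

// Usage: write a small sample archive, then stream it back.
$sample = sys_get_temp_dir() . '/feed_sample.txt.gz';
file_put_contents($sample, gzencode("first,record\nsecond,record\n"));

$seen = [];
$total = forEachGzLine($sample, function (string $line) use (&$seen) {
    $seen[] = $line;
});
unlink($sample);
```

In the directory loop, the gzfile()/foreach pair could be swapped for a call like this, with the callback doing the per-record work.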

That works great, but it is not printing the whole name in some records.

This line

1GB kit (512MBx2) Upgrade for a Dell OptiPlex 745 Series (Desktop, Mini-Tower, and Small Form Factor) System

Is printing this

"1GB kit (512MBx2) Upgrade for a Dell OptiPlex 745 Series (Desktop

Any idea why?

This is the first line that would not print correctly. There are 40 columns in here...

Crucial Technology,http://www.crucial.com/index.asp,Crucial US Product Catalog-Data Feed,

12/06/2010,"1GB kit (512MBx2) Upgrade for a Dell OptiPlex 745 Series (Desktop, Mini-Tower, and Small Form Factor) System",MEMORY MODULE,"1GB kit (512MBx2), 240-pin DIMM, DDR2 PC2-5300, NON-ECC,",CT613060,Crucial,,,,USD,,37.99,,,http://www.kqzyfj.com/click-4349884-10273954?url=http%3A%2F%2Fwww.crucial.com%2Fstore%2Faffiliateredirect.asp%3Fmtbpoid%3DC2FFE7ABA5CA7304%26aid%3D10273954%26cid%3D777292%26subid%3D890%26PRS%3Duscj,http://www.tqlkg.com/image-4349884-10273954,http://images.crucial.com/images/resources/small/package/240-pinDIMMkit_2.gif,Memory > DDR2 PC2-5300,,,,,,,,,,,Free shipping for a limited time on qualified orders,,,,,Yes,New,Limited lifetime warranty,

OK, that's somewhat tricky but not impossible. Some fields are quoted and others aren't, which makes it harder than it could be. The ideal solution is to replace explode() with something that will recognize and honour the quoted fields, so it doesn't split on the commas within those fields.
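
One built-in candidate for that replacement is str_getcsv() (available since PHP 5.3). The sketch below is only an illustration: the record is a shortened copy of the line posted above, not all 40 columns, and the variable names are arbitrary.

```php
<?php
// Shortened version of the record from the thread (first six columns only).
$line = 'Crucial Technology,http://www.crucial.com/index.asp,Crucial US Product Catalog-Data Feed,'
      . '12/06/2010,"1GB kit (512MBx2) Upgrade for a Dell OptiPlex 745 Series '
      . '(Desktop, Mini-Tower, and Small Form Factor) System",MEMORY MODULE';

// explode() splits on every comma, including the ones inside the quotes,
// which is why the name gets cut off after "(Desktop".
$naive = explode(',', $line);

// str_getcsv() treats a quoted field as a single value and strips the quotes.
$fields = str_getcsv($line, ',', '"', '\\');

echo $naive[4], "\n";
echo $fields[4], "\n";
```

Note that parsing one line at a time like this assumes no quoted field contains a line break. If the feed can have those, reading records with a CSV-aware reader rather than splitting on newlines first would be safer.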

 

OR, if you only need field 4 and none of the others, you could make a regexp to capture just that field.
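
A rough sketch of the regexp route, assuming standard CSV quoting (a literal quote inside a quoted field is written as two quotes) and PHP 7.1 or later. The function name fifthField is hypothetical, and the sample record is a shortened copy of the one posted above.

```php
<?php
// Hypothetical helper: returns the fifth field (index 4) of one CSV line,
// or null if the line has fewer than five fields.
function fifthField(string $line): ?string
{
    // One field to skip: either a quoted field or a run without commas/quotes.
    $skip = '(?:"(?:[^"]|"")*"|[^,"]*)';
    // Skip four fields and their commas, then capture the fifth:
    // group 1 if it is quoted, group 2 if it is not.
    $pattern = '/^(?:' . $skip . ',){4}(?:"((?:[^"]|"")*)"|([^,"]*))/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // Group 2 is only present when the unquoted branch matched.
    return $m[2] ?? str_replace('""', '"', $m[1]);
}

$line = 'Crucial Technology,http://www.crucial.com/index.asp,Crucial US Product Catalog-Data Feed,'
      . '12/06/2010,"1GB kit (512MBx2) Upgrade for a Dell OptiPlex 745 Series '
      . '(Desktop, Mini-Tower, and Small Form Factor) System",MEMORY MODULE';

echo fifthField($line), "\n";
```

This avoids splitting the other 39 columns, at the cost of a pattern that is harder to read than a CSV parser call.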
