Avoiding timeout error?


Samuz


As the title says, I have a .txt file with about 30,000+ lines of data presented in a pipe-delimited list, which I'm parsing and inserting into my database.

 

Problem is, my server seems to time out every time I try to parse the whole file at once. I'm sure it would normally work without any timeout errors, but I'm ensuring the data is XSS-clean before it's inserted, and I'm doing that on about 15 items per line, which means I'm calling the xss_clean function over 450,000 times in one execution.

 

So my friend suggested I break the file down into smaller files of maybe 5,000 lines of data each, which would mean I'd generate about 6 files (for 30,000 lines of data).

 

I've managed to code a script that breaks the main file into several files.
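Roughly, the splitter does something like this (a simplified sketch; the real script is a bit more involved and the file names here are placeholders):

// Read the big pipe-delimited file and write it back out in 5,000-line chunks.
$lines = file('uploads/main.txt', FILE_IGNORE_NEW_LINES);
foreach (array_chunk($lines, 5000) as $i => $chunk) {
    file_put_contents('uploads/chunk_' . $i . '.txt', implode("\n", $chunk) . "\n");
}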

 

Now what I want to do is pass each of those files to my parser method, but I'd like to do them one by one rather than all in one execution, since I want to avoid the timeout error.

 

Any ideas?


If the problem is a php script execution time limit, just set the time limit to a longer reasonable value - set_time_limit
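For example, near the top of the parsing code (the 300 is just an example value, pick whatever is reasonable for your data):

set_time_limit(300); // allow this request up to 5 minutes, overriding max_execution_time for this script only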

 

If you are trying to optimize the execution of your script, breaking the file into parts won't directly help. Your code still has to eventually read through and process all the lines in all the files. If you want us to suggest ways of optimizing your script, you would need to post your script along with some sample data.

 

If you are executing a database query inside of a loop, it generally takes more time for php to send the query statement to the database server (and, if using a prepared query, to send the actual data to the database server) than it takes for the query itself to run on the database server. It is generally most efficient to reduce the number of queries by forming a multi-value insert query and inserting several hundred to several thousand rows of data in each query (depending on how much data is in each row, so as to not exceed the maximum length of one query).
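As a rough sketch (the table, the columns, and the $mysqli connection here are made up for illustration; real code would also run the query every 500-1000 rows and reset the list):

// Build one multi-value INSERT instead of running one query per row.
$values = array();
foreach ($rows as $row) {
    $values[] = "('" . $mysqli->real_escape_string($row['name']) . "', " . (int) $row['qty'] . ")";
}
$sql = "INSERT INTO items (name, qty) VALUES " . implode(", ", $values);
$mysqli->query($sql);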

 

Edit: Are you sure the problem isn't a memory usage problem? Do you get any php errors when you set php's error_reporting to E_ALL and display_errors to ON?
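i.e. temporarily put something like this at the top of the script while debugging:

error_reporting(E_ALL);         // report every type of php error
ini_set('display_errors', '1'); // and show them in the output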

 

Edit2: It is also more efficient to use php's array functions, such as array_walk, to perform the same operations on all the data within an array, rather than looping over each element in the array.
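For example, something like this would clean every column in one call (xss_clean being whatever sanitizing function you use):

// Apply xss_clean to every element of $column without naming each index.
array_walk($column, function (&$value) {
    $value = xss_clean($value);
});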

 


First, thank you for the replies, guys, and I apologize for the late reply; the forums didn't send me any notification PMs this time for some reason.

 

Anyway.

 

I'm using CodeIgniter, so for those who aren't too familiar with it, I'll try my best to explain exactly what each method does.

 

My parser method looks like this:

 

private function _parse_aid($fname, $type) {

        $file_path = 'uploads/' . $type . '/temp/'; // path to the file
        $handle = fopen($file_path . $fname, "r"); // open the file

        if ($handle) :
            $final = array(); // make sure the batch array exists even if no lines match
            while (($buffer = fgets($handle, 1024)) !== false) :
                $column = explode("|", $buffer); // explode the pipe-delimited line into an array
                if (strtotime($column[14]) > (time() - 1296000)) : // only keep rows dated within the last 15 days
                    $final[] = array(
                        'send_NID' => xss_clean($column[0]),
                        'send_ruler' => xss_clean($column[1]),
                        'send_nation' => xss_clean($column[2]),
                        'send_alliance' => xss_clean($column[3]),
                        'rec_NID' => xss_clean($column[5]),
                        'rec_ruler' => xss_clean($column[6]),
                        'rec_nation' => xss_clean($column[7]),
                        'rec_alliance' => xss_clean($column[8]),
                        'status' => xss_clean($column[10]),
                        'money' => xss_clean($column[11]),
                        'tech' => xss_clean($column[12]),
                        'soldiers' => xss_clean($column[13]),
                        'date_sent' => xss_clean($column[14]),
                        'id' => xss_clean($column[16])
                    ); // save each line in its own array
                endif;
            endwhile;
            if (! empty($final)) :
                $this->db->replace_batch($type, $final);
                // use MySQL REPLACE to update/insert the values from an array (so it produces one query rather than 30,000)
                // this method isn't actually a part of CI (although replace() is); I just copied it and modified it a bit to look like insert_batch()
                // it's definitely not a problem, as it works
            endif;
            /* echo '<pre>';
              print_r($final);
              echo '</pre>'; */
            fclose($handle); // close the handle before renaming
        endif;
        rename($file_path . $fname, $file_path . $this->file_name . '.txt'); // rename the file to something else
    }

 

The script worked fine before, although it did take a while to actually parse everything (about 100 seconds). So yeah, I'm just looking for ideas to optimize this code.

 

@PFMaBiSmAd - I thought about that, but I'm sure I read somewhere that the time limit in php.ini overrides it if set? That would be a problem, because I don't actually have access to php.ini (shared server). Correct me if I'm wrong.

Also, no errors are being generated in any of my log files while error_reporting() is set to E_ALL. The only error I get is something along the lines of this:

 

Internal Server Error

 

The server encountered an internal error or misconfiguration and was unable to complete your request.

 

Please contact the server administrator, webmaster@lyricalz.com and inform them of the time the error occurred, and anything you might have done that may have caused the error.

 

More information about this error may be available in the server error log.

 

Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

 

It's pretty vague, imo.

@darkfreaks, thanks for that, but I don't think it will help in this situation? I'm not sure. Because I'm using one table for this data and every row has its own unique id, I'd like to either update or insert.


@PFMaBiSmAd - I thought about that, but I'm sure I read somewhere that the time limit in php.ini overrides it if set? That would be a problem, because I don't actually have access to php.ini (shared server). Correct me if I'm wrong.

 

No, the time you give using set_time_limit will override any php.ini value that has been set.  One thing I will do for parsers and such is come up with an approximate time-per-record to parse and then multiply that by the number of records that need to be processed.  If it's not really possible to determine the number of records, then just set it to a high value like 999.  Note that setting it to 0 will remove any time limit, allowing the script to run forever, but you should avoid that in case you end up with an error causing an infinite loop.
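Something along these lines, where the 0.01 seconds-per-record figure is only an example you would measure for your own parser:

// Scale the time limit to the size of the file being parsed.
$lineCount = count(file($file_path . $fname)); // number of records in the file
$secondsPerRecord = 0.01;                      // rough per-record estimate, measure your own
set_time_limit(max(60, (int) ceil($lineCount * $secondsPerRecord)));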

 

