Jump to content

minor changes & tailoring of a regex - that allready works very fine


dilbertone

Recommended Posts

 

Good day dear community.

 

I need to build a function which parses the domain from a url. I have used various ways to parse html sources. But this one is is a bit tricky! See the target i want to parse - it has some invaild Markup:

 

http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=644.0013008534253&SchulAdresseMapDO=194190

 

well what do you think - can i apply this code here

 

<?php
require_once('config.php'); // call config.php for db connection
$filename = "url.txt"; // Include the txt file which have urls
$each_line = file($filename);
foreach($each_line as $line_num => $line)
{
    $line = trim($line);
    $content = file_get_contents($line);
    //echo ($content)."<br>";
    $pattern = '/<td>(.*?)<\/td>/si';
    preg_match_all($pattern,$content,$matches);

    foreach ($matches[1] as $match) {
        $match = strip_tags($match);
        $match = trim($match);
        //var_dump($match);
        $sql = mysqli_query("insert into tablename(contents) values ('$match')");
        //echo $match;
    }
}
?>

 

well i have to rework the parser-part of this script. I need to parse somway different - since i have other site here.

 

Can anybody help me  here to get a better regex - or a better way to parse this site ...

 

Any and all help will be greatly apprecaited.

 

regards

db1

 

 

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.