
create html parser loop through


dflow


How should I approach the following?

I have a page with a list of products, each with a link to its product page.

I want to build a crawler that loops through all the products in the list, follows each link to the product page, and parses that page.

I need help with the loop.

 


What do you mean? You need an array of product links, which you collect from the list page using whatever method you like. You then use a foreach() loop to go through the links and parse the page each one points to.

 

$products = array('linkpage1.html', 'linkpage2.html');

foreach ($products as $product) {
    // parse() stands in for whatever function fetches and parses a product page
    parse($product);
}
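
One way to build the $products array in the first place is sketched below. It uses PHP's built-in DOM extension rather than any particular scraping library, and it reads from a sample HTML string instead of a live page. The helper name extractLinks() and the sample markup are assumptions made for illustration, not something from the thread.

```php
<?php
// Hypothetical helper: return the href of every <a> element in an HTML string.
function extractLinks(string $html): array
{
    $dom = new DOMDocument();
    // Real-world markup is rarely valid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return $links;
}

// Sample markup standing in for the product list page.
$listPage = '<ul>'
    . '<li><a href="product/product_1.htm">Product 1</a></li>'
    . '<li><a href="product/product_2.htm">Product 2</a></li>'
    . '</ul>';

$products = extractLinks($listPage);

foreach ($products as $product) {
    // A real crawler would fetch and parse the product page here.
    echo $product, "\n";
}
```

In a real crawler the HTML string would come from the list page itself, and the body of the loop would fetch each product page and hand it to the parser.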


OK, got you.

Now, how can I pick out the links structured like "product/product_1.htm" from the array I created?

I collected all the links on the page and want only those specific ones.

For example:

foreach($html->find('a') as $e) {
    echo $arraylinks[] = $e->href . '<br>';

}
$linkChunks = explode("product/", $apartmentpage_linkr);



As a test:

 

foreach ($arraylinks as $link) {
   $category = basename(dirname($link));
   $page = basename($link);
   
   if ($category == "apartments") {
      echo  $page.'<br />';
   }
}
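
To make the test above easier to follow, here is a small sketch of what basename() and dirname() return for the kind of link asked about. The absolute URLs are made-up examples. The parse_url() step is an optional extra, not part of the original suggestion, for links that carry a query string.

```php
<?php
$link = 'product/product_1.htm';

$category = basename(dirname($link)); // "product"
$page     = basename($link);          // "product_1.htm"

// The same calls on an absolute URL (hypothetical address).
$url         = 'http://www.example.com/apartments/flat_12.aspx';
$urlCategory = basename(dirname($url)); // "apartments"
$urlPage     = basename($url);          // "flat_12.aspx"

// A link with no directory part yields "." as its category.
$bare = basename(dirname('index.htm')); // "."

// If a link has a query string, strip it first by taking only the URL path.
$withQuery = 'http://www.example.com/apartments/view.aspx?id=3';
$path      = parse_url($withQuery, PHP_URL_PATH); // "/apartments/view.aspx"
$cleanPage = basename($path);                     // "view.aspx"
```

Since dirname() and basename() only look at the "/" separators, they work the same way on relative links and full URLs.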

 

Works, thanks.

What was the problem before? I got the results, but with that error.


Actually, now I'll need the results as an array so that I can loop through each link.


Something like this will give all the results in an array:

 

foreach ($arraylinks as $link) {
   $category = basename(dirname($link));
   $page = basename($link);
   
   $links[$category][] = $page;
}

Then you can do something like this:

 

foreach($links['apartments'] as $page) {
   echo $page;
}

or:

 

foreach ($links as $category => $pages) {
   // each $pages is an array of page names, so join it before printing
   echo $category . ': ' . implode(', ', $pages);
}

 


OK.

I'm getting the links, but each one appears three times. How can I limit it to one result per link?

Now I'm trying to put things together and making a mess. I want to loop through each link and parse the HTML contents of its page.

 

 

<?php
// example of how to use basic selector to retrieve HTML contents
include('../simple_html_dom.php');

// get DOM from URL or file
$html = file_get_html('http://www.example.com/ViewAllApartments.aspx');

// find all links
foreach($html->find('a') as $e) {
     $arraylinks[] = $e->href . '<br>';
}

foreach ($arraylinks as $link) {
   $category = basename(dirname($link));
   $page = basename($link);

   if ($category == "apartments")
{
{
   $url="http://www.example.com/apartments/";
      echo  $page.'<br />';
  echo  $url.$page.'<br />';
   }
}

foreach($links['apartments'] as $page) {
   $phtml = file_get_html($url.$page);

foreach($phtml->find('span[id=apartmentname]') as $apartmentname)
    echo $apartmentname->plaintext.'<br><br>';
}

?>
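
A possible way to untangle the script above is sketched here, under a few assumptions. The helper name pagesInCategory() is invented for illustration, and the sample hrefs stand in for what $html->find('a') would return. The sketch stores each href as-is (the script above appends '<br>' to every stored link, which then ends up in the page name), keeps only the "apartments" links, and uses array_unique() to reduce the three copies of each link to one. The fetch-and-parse step is left as comments because it needs simple_html_dom and a live site.

```php
<?php
// Hypothetical helper: keep the pages that sit in $category, one entry per page.
function pagesInCategory(array $hrefs, string $category): array
{
    $pages = array();
    foreach ($hrefs as $href) {
        if (basename(dirname($href)) === $category) {
            $pages[] = basename($href);
        }
    }
    // array_unique() drops repeats; array_values() reindexes from 0.
    return array_values(array_unique($pages));
}

// Sample hrefs standing in for the links collected from the list page.
$hrefs = array(
    'apartments/flat_1.aspx',
    'apartments/flat_1.aspx',
    'apartments/flat_1.aspx',
    'apartments/flat_2.aspx',
    'contact/about.aspx',
);

$pages = pagesInCategory($hrefs, 'apartments');
$base  = 'http://www.example.com/apartments/';

foreach ($pages as $page) {
    echo $base . $page, "\n";
    // With simple_html_dom loaded, each page could be fetched and parsed here:
    // $phtml = file_get_html($base . $page);
    // if ($phtml) {
    //     foreach ($phtml->find('span[id=apartmentname]') as $name) {
    //         echo $name->plaintext, "\n";
    //     }
    // }
}
```

Keeping the filtering in one function and the fetching in one loop also avoids the mismatched braces, since each loop opens and closes in a single place.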
