etrader, posted February 20, 2011:

It is easy to get an image or a link with DOMDocument, but I have not found a way to get an image together with its target link. Consider HTML such as:

    <div class=image>
    <a href='http://site.com'><img src='imagelink.jpg'></a>
    </div>

How can I get both the image source and the href?

    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//div[@class='image']");
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        // the <a> and <img> inside this div are needed here
    }

To get the image and its href, we first need getElementsByTagName('a') and getElementsByTagName('img'), but they do not work inside the loop. What's your idea?
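One possible way to approach the question above, offered as a minimal sketch rather than a definitive answer: getElementsByTagName() is a method of DOMElement as well as DOMDocument, so it can be called on each matched div inside the loop. The variable names ($pairs, $div, $link, $img) are illustrative only, and libxml_use_internal_errors() is swapped in for the @ operator to silence parser warnings.

```php
<?php
// Sample markup taken from the question.
$html = "<div class=image> <a href='http://site.com'><img src='imagelink.jpg'></a> </div>";

// Collect libxml warnings internally instead of suppressing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$pairs = [];
// Each result is a DOMElement, so element-level lookups work on it directly.
foreach ($xpath->query("//div[@class='image']") as $div) {
    $link = $div->getElementsByTagName('a')->item(0);
    $img  = $div->getElementsByTagName('img')->item(0);

    // item() returns null when the div has no <a> or no <img>.
    if ($link === null || $img === null) {
        continue;
    }

    $pairs[] = [
        'href' => $link->getAttribute('href'),
        'src'  => $img->getAttribute('src'),
    ];
}

print_r($pairs);
```

This sketch takes the first link and the first image in each div. If a div can hold several linked images, looping over $div->getElementsByTagName('a') and reading the img inside each link would be the safer variant.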
silkfire, posted February 20, 2011:

If I were you, I would refrain from using DOMDocument; it's very unreliable. Please consider using regex instead, which is very precise and powerful for parsing documents. This piece of code will return the URLs of all the links in the document that surround an image:

    preg_match_all('#class="image".*?a href='([^']+).*?img src#s', $html, $imglinks);
    foreach ($imglinks[1] as $imglink) echo $imglink;
etrader (author), posted February 20, 2011:

Nice idea, but I got an error in the preg_match_all line:

    PHP Parse error: syntax error, unexpected '('
BlueSkyIS, posted February 20, 2011:

You need to escape the single quotes inside the single-quoted pattern string:

    preg_match_all('#class="image".*?a href=\'([^\']+).*?img src#s', $html, $imglinks);
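A note on the escaped pattern: it expects class="image" in double quotes, while the sample HTML in the question writes class=image with no quotes, so it may still find nothing on that input. The following is a hedged sketch of a variant that tolerates either form and captures both the href and the image source. It assumes single-quoted href and src attributes, as in the sample; the variable name $matches is illustrative.

```php
<?php
// Sample markup taken from the question.
$html = "<div class=image> <a href='http://site.com'><img src='imagelink.jpg'></a> </div>";

// Inside a single-quoted PHP string, each literal ' must be written as \'.
// ["\']? lets the class value appear with double, single, or no quotes.
$pattern = '#class=["\']?image["\']?.*?<a\s+href=\'([^\']+)\'.*?<img\s+src=\'([^\']+)\'#s';

// PREG_SET_ORDER groups the captures per match: [1] is the href, [2] is the src.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo $m[1], ' -> ', $m[2], "\n";
}
```

Because .*? with the s modifier can run across element boundaries, a div without a link could be paired with a link from a later div, so this is only suitable for markup with a known, regular shape.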