Help parsing HTML


odine

OK, I'm pretty new to PHP for the most part, but I understand programming languages to a decent extent. Anyway, I'm trying to parse an HTML page to get data out of it, and probably in turn put it into an SQL table. All I need help with is doing the parsing with DOM and XPath queries, or whatever the best way to do this would be.

 

Page I'm trying to parse: http://us.battle.net/wow/en/guild/Moonrunner/The%20Eternal%20Blade/news

 

Basically, the data I want to put into SQL (or into variables for the time being) is the 25 results returned in the news feed. The first one is Mudkips' item, Vicious Gladiator's Signet of Cruelty, and the last one is: Lionus earned the achievement Level 30 for 10 points. Can anyone please give me some help with a function that could do this? Please! :)


OK, so below is the code, and below that is some explanation.

 

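        // Fetch the guild news page as a string.
        $news = file_get_contents("http://us.battle.net/wow/en/guild/Moonrunner/The%20Eternal%20Blade/news");

        // Collect libxml parse warnings instead of silencing them with @.
        libxml_use_internal_errors(true);

        $dom = new DOMDocument();
        $dom->preserveWhiteSpace = false; // set before loading so it can take effect
        $dom->loadHTML($news);

        $xpath = new DOMXPath($dom);

        // Select every link inside each news entry.
        $aName = $xpath->query('/html/body/div/div[2]/div/div[2]/div/div[2]/div[2]/div/div/ul/li/dl/dd/a');

        // Print the text of each matched link, one per line.
        foreach ($aName as $data) {
                echo $data->nodeValue . "\n";
        }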

 

OK, that prints all the data I want, along with some blank space (I believe it's actually attributes/HTML). I want to make a few variables out of the data I'm parsing: one would be $name, another $value, and another $text. I was hoping to do that within a foreach loop similar to the one above, but so far I'm not sure how (new to PHP).

If I change the $aName variable to $aName = $xpath->query('/html/body/div/div[2]/div/div[2]/div/div[2]/div[2]/div/div/ul/li/dl/dd/a[1]'); that would give me my $name variable, but to get $value I would have to make another $aName variable with a[2] at the end. Is there an easier way to do this in a loop?

Last but not least, the $text variable that I want to make is 'purchased item', which would come out of:

<a href="/wow/en/character/moonrunner/mudkips/">Mudkips</a> purchased item <a href="/wow/en/item/60410" class="color-q4">Vicious Gladiator&#39;s Dreadplate Helm</a>.

 

I hope someone understands me and can help me out. Thanks in advance!
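
One possible way to get all three values in a single loop is to select each dd once and then run relative XPath queries against it, passing the dd as the context node. This is only a minimal sketch: the dd content is the snippet quoted above, while the surrounding ul/li/dl wrapper and the shortened '//ul/li/dl/dd' path are assumptions inferred from the long XPath, and the $rows array is a hypothetical name. On the live page the full path may still be needed.

```php
<?php
// Sample markup. The <dd> content is the snippet from the post; the
// <ul>/<li>/<dl> wrapper is ASSUMED from the XPath used in the thread.
$html = '<ul><li><dl><dd>'
      . '<a href="/wow/en/character/moonrunner/mudkips/">Mudkips</a> purchased item '
      . '<a href="/wow/en/item/60410" class="color-q4">Vicious Gladiator&#39;s Dreadplate Helm</a>.'
      . '</dd></dl></li></ul>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = []; // hypothetical container, one entry per news item

// Shortened path (an assumption); one <dd> per news entry.
foreach ($xpath->query('//ul/li/dl/dd') as $dd) {
    // The second argument is the context node, so a[1] and a[2]
    // are resolved relative to the current <dd>.
    $name  = $xpath->evaluate('string(a[1])', $dd);
    $value = $xpath->evaluate('string(a[2])', $dd);

    // Text nodes sitting directly inside <dd>, i.e. outside the links.
    $text = '';
    foreach ($xpath->query('text()', $dd) as $node) {
        $text .= $node->nodeValue;
    }
    // Strip surrounding whitespace and the trailing full stop.
    $text = trim($text, " \t\n\r.");

    $rows[] = ['name' => $name, 'value' => $value, 'text' => $text];
}

print_r($rows);
```

Entries with a different shape (for example an achievement line) may leave $value empty or $text split around a link, so the results are worth checking before they go into a database.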


Welp, I was hoping someone could help me out! :( Below is the code I've been playing around with while trying to figure this all out. It's outputting the proper things now, BUT I need to be able to take the children (in a loop or something?) and put them into variables or an array so that I can eventually get them into my SQL DB (I'm not worried about that part; I know what to do there). Again, I need these in variables: the character's name, the time the achievement occurred, and the objective (obtained item or obtained achievement). Can someone help me with how to access those children of the main XPath result so I can get them into variables? I hope this makes sense!

 

Well, I was really hoping someone could help me out! :(

 

Again, what I'm trying to do is get ALL THE CHILDREN of '/html/body/div/div[2]/div/div[2]/div/div[2]/div[2]/div/div/ul/*' so that I can sort that data into variables. But I have no clue how to get the children of every single news item listed on: http://us.battle.net/wow/en/guild/Moonrunner/The%20Eternal%20Blade/news
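
A rough sketch of one way to walk the children of every news item: query the li elements once, then run the relative query './/*' with each li as the context node to list all of its descendant elements. The sample markup below is an assumption (the real page structure is not shown in this thread), and describeChildren and $items are hypothetical names used only for illustration.

```php
<?php
// ASSUMED markup: two news entries shaped like the snippet quoted earlier.
// The real page may nest things differently or carry extra elements.
$html = '<ul>'
      . '<li><dl><dd><a href="/wow/en/character/moonrunner/mudkips/">Mudkips</a>'
      . ' purchased item <a href="/wow/en/item/60410">Vicious Gladiator&#39;s Dreadplate Helm</a>.</dd></dl></li>'
      . '<li><dl><dd><a href="#">Lionus</a> earned the achievement'
      . ' <a href="#">Level 30</a> for 10 points.</dd></dl></li>'
      . '</ul>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

/**
 * Hypothetical helper: list every descendant element of one news item
 * as a tag name plus its trimmed text content, in document order.
 */
function describeChildren(DOMXPath $xpath, DOMNode $item): array
{
    $out = [];
    // './/*' is relative to $item because it is passed as the context node.
    foreach ($xpath->query('.//*', $item) as $el) {
        $out[] = ['tag' => $el->nodeName, 'text' => trim($el->textContent)];
    }
    return $out;
}

$items = []; // one list of children per <li>
foreach ($xpath->query('//ul/li') as $li) {
    $items[] = describeChildren($xpath, $li);
}

print_r($items);
```

Dumping the children like this shows which tag holds the name, the time, and the objective on the real page; a more specific relative query per field can then replace the catch-all './/*'.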

