Getting data from external website (a wiki) into my website


FlameDra

I'm trying to make a League of Legends (a video game) community website, both as a personal project and for practice.

The game has a lot of champions, each of which has 5 unique abilities. I thought about manually entering all the details for each champion into a MySQL database, but that would be long and tedious, and I don't really have the time for it now. Also, the game is patched very often (about once every 2 weeks), which changes many of the champions' stats, and I can't keep manually updating them after every patch.

 

Fortunately, there is a League of Legends Wiki which has all the data I need on its champion pages, and it is kept updated with every patch. So I was wondering if there is any way to get the data from the divs in the wiki and display it on my site. What I want on my website is that whenever someone types a champion's name (in a post or wherever), it displays a hover-over dialog with some of the champion's details, plus a lot of other features like that.
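A minimal sketch of the hover-over idea, assuming the champion summaries are already available as a PHP array. The function name `annotateChampions`, the `champion` CSS class, and the use of a plain `title` attribute as the tooltip are all illustrative choices, not something the thread specifies. It also assumes champion names are plain words without apostrophes.

```php
<?php
// Hypothetical helper: wraps known champion names in a <span> whose title
// attribute carries a short summary (browsers show it as a tooltip on hover).
function annotateChampions(string $text, array $champions): string
{
    // Escape the raw post first so user input cannot inject markup.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    if ($champions === []) {
        return $safe;
    }

    // Build one alternation of all names, each quoted for use in a regex.
    $names = array_map(
        fn ($name) => preg_quote((string) $name, '/'),
        array_keys($champions)
    );
    $pattern = '/\b(' . implode('|', $names) . ')\b/';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($champions): string {
            $summary = htmlspecialchars($champions[$m[1]], ENT_QUOTES, 'UTF-8');
            return '<span class="champion" title="' . $summary . '">' . $m[1] . '</span>';
        },
        $safe
    );
}

echo annotateChampions(
    'I main Caitlyn & Ahri',
    ['Caitlyn' => 'The Sheriff of Piltover']
), "\n";
```

A richer dialog would need some JavaScript or CSS on top of this; the PHP side only has to mark up the names and attach the data.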

 

In plain English, I need a way to:

> Tell PHP to go to the wiki's source code on a specific page

> Find a specific div container

> Get X data from there

> Pass X data into a function to display the hover-over
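The steps above can be sketched roughly as follows, assuming PHP's DOM extension is available. The markup, the `stats` id, and the helper name `extractDivText` are made up for illustration; a real page would be downloaded first (for example with `file_get_contents`) and would use whatever ids or classes the wiki actually emits.

```php
<?php
// Returns the text content of the first <div> with the given id, or null.
function extractDivText(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('id') === $id) {
            return trim($div->textContent);
        }
    }
    return null;
}

// Hypothetical markup standing in for a fetched page.
$html = '<html><body><div id="stats">Range: 650</div><div id="lore">Lore text</div></body></html>';
echo extractDivText($html, 'stats'), "\n";
```

Scraping rendered HTML like this is fragile, because any change to the page layout can break the lookup.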

 

I think this way, I would not have to maintain a database as I can leech off the wiki's data.

I have not coded anything like this before, so I would like a few pointers as to how to achieve this. Any help will be appreciated! :D

If you look at the very bottom of that page you will see a link to the API documentation. You should read that.

Thanks for pointing it out, I completely missed that!

 

I got to the following pages:

http://leagueoflegends.wikia.com/api.php

http://leagueoflegends.wikia.com/wiki/Special:ApiExplorer

 

I'll be reading around and see if I can find something to help me get data from the wiki :D

You shouldn't leech the content either, as fetching it from the wiki on every request will slow down your page loads.

 

Have a CRON execute once every two weeks (when updates happen) that grabs all of the information and updates your local database. You may want to perform this daily, instead, if an update is a day late or they're slow to update the wiki.
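A rough sketch of such a refresh script, under a few assumptions: the table name `champions`, its columns, and the function names are invented for illustration, and the download step is passed in as a callable so it can be replaced freely. `REPLACE INTO` is used because it overwrites an existing row in MySQL (and SQLite); other databases would need a different upsert statement.

```php
<?php
// Hypothetical refresh routine meant to be run from cron, for example with a
// crontab entry such as:  0 4 * * * php /path/to/refresh.php
function ensureTable(PDO $db): void
{
    $db->exec(
        'CREATE TABLE IF NOT EXISTS champions (
            name VARCHAR(64) PRIMARY KEY,
            wikitext TEXT NOT NULL,
            updated_at INTEGER NOT NULL
        )'
    );
}

// Inserts the row, or overwrites it when the champion is already stored.
function storeChampion(PDO $db, string $name, string $wikitext, int $now): void
{
    $stmt = $db->prepare(
        'REPLACE INTO champions (name, wikitext, updated_at) VALUES (?, ?, ?)'
    );
    $stmt->execute([$name, $wikitext, $now]);
}

// $fetch is any callable that returns the raw page text for a champion name,
// so the download step can be swapped out or faked in tests.
function refreshAll(PDO $db, array $names, callable $fetch): int
{
    ensureTable($db);
    $count = 0;
    foreach ($names as $name) {
        $text = $fetch($name);
        if (is_string($text) && $text !== '') {
            storeChampion($db, $name, $text, time());
            $count++;
        }
    }
    return $count;
}
```

The site's pages then read only from the local table, so a slow or unavailable wiki never affects page loads.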

I thought of many ways of gathering the data. One of them was to keep a local database and have a script fetch the data from the wiki and write it to that database, but I have no idea how to achieve that. Could you link me to some examples or tutorials?

Okay so I've been poking around the Wikia API and this is what I've come up with so far.

 

For this example, I'm attempting to get all the data on Caitlyn, the Sheriff of Piltover. After digging around the documentation, I found out that to get all the data on that page in XML format I have to run the following query:

 

http://leagueoflegends.wikia.com/api.php?action=query&titles=Caitlyn_the_Sheriff_of_Piltover&prop=revisions&rvprop=content&format=xml

 

Below is the result that I get:

<api><query><normalized><n from="Caitlyn_the_Sheriff_of_Piltover" to="Caitlyn the Sheriff of Piltover"/></normalized><pages><page pageid="117773" ns="0" title="Caitlyn the Sheriff of Piltover"><revisions><rev xml:space="preserve">{{C-top}}{{infobox champion
| name        = Caitlyn
| image       = File:CaitlynSquare.png
| title       = The Sheriff of Piltover
| herotype    = {{Attributes|Ranged|Carry}}
| date        = January 4, 2011<ref>[[V1.0.0.108]]</ref>
| health      = 40
| attack      = 80
| spells      = 40
| difficulty  = 40
| hp          = 390 (+80)
| mana        = 255 (+35)
| damage      = 47 (+3)
| range       = 650
| armor       = 13 (+3.5)
| magicresist = 30 (+0)
| attackspeed = 0.668 (+3.0%)
| healthregen = 4.75 (+0.55)
| manaregen   = 6.5 (+0.55)
| speed       = 300
| IP          = 6300
| RP          = 975

}}'''Caitlyn, the Sheriff of Piltover''' is a [[champion]] in [[League of Legends]].<ref>[http://www.leagueoflegends.com/champions/51/caitlyn_the_sheriff_of_piltover Caitlyn's profile page] at [[leagueoflegends.com]]</ref>
Last time on Rotation: May 8th, 2012
==Abilities==
{{Abilities
|ver         = v1.0.0.129
|innatename  = Headshot
|innateinfo  = '''(Innate)''': Every {{pp|3|8th|7th|6th|1|7|13}} auto-attack is enhanced to be a headshot, dealing 150% damage to a champion or 250% damage to a minion or monster. Attacks from brush increase the attack counter by two instead of one.

|firstname   = Piltover Peacemaker
|firstinfo   = '''(Active)''': Caitlyn revs up her rifle for 1 second to unleash a penetrating shot in a line which will deal physical damage to all targets hits. It will deal 15% less damage for each subsequent target hit, down to a minimum of 40% damage dealt.
*'''Range:''' 1300
|firstlevel  = {{level up|Cost|mana|50|60|70|80|90}}
{{level up|Cooldown|seconds|10|9|8|7|6}}
{{level up|Physical Damage|(+1.3 per attack damage)|20|60|100|140|180}}
{{level up|Minimum Physical Damage|(+0.52 per attack damage)|8|24|40|56|72}}

|secondname  = Yordle Snap Trap
|secondinfo  = '''(Active)''': Caitlyn sets a trap at the target nearby location. The trap triggers when a champion walks over it. This trap is visible to both allies and enemies. When sprung, the trap immobilizes the champion for 1.5 seconds, dealing magic damage over the duration and additionally revealing the target for 9 seconds. Caitlyn can set up to 3 traps and they last 4 minutes. When she sets a trap once the cap has been reached, the oldest trap will deactivate itself.
*'''Cost:''' 50 mana
*'''Placement Range:''' 800
*'''Activation Range:''' 135
|secondlevel = {{level up|Cooldown|seconds|20|17|14|11|8}}
{{level up|Total Magic Damage|(+0.6 per ability power)|80|130|180|230|280}}

|thirdname   = 90 Caliber Net
|thirdinfo   = '''(Active)''': Caitlyn fires a heavy net in front of her, knocking herself back in the opposite direction. The net will slow down the first target hit by 50% and will deal magic damage to it.
*'''Cost:''' 75 mana
*'''Range:''' 1000
*'''Knockback Distance:''' 400
|thirdlevel  = {{level up|Cooldown|seconds|18|16|14|12|10}}
{{level up|Magic Damage|(+0.8 per ability power)|80|130|180|230|280}}
{{level up|Slow Duration|seconds|1|1.25|1.5|1.75|2}}

|ultiname    = Ace in the Hole
|ultiinfo    = '''(Active)''': Caitlyn marks an enemy champion at a huge range and channels for 2 seconds to line up the perfect shot, providing vision of the target for the duration. She then fires the projectile to deal massive physical damage. Enemy champions can intercept the bullet for their ally.
*'''Cost:''' 100 mana
*'''Projectile Speed:''' 3200
|ultilevel   = {{level up|Cooldown|seconds|90|75|60}}
{{level up|Range||1900|2050|2200}}
{{level up|Physical Damage|(+2.0 per bonus attack damage)|250|475|700}}
}}


==References==
{{Reflist}}
{{C-bot}}
[[Category:2011 release]]
[[Category:Season One release]]
[[Category:Released Champion]]</rev></revisions></page></pages></query></api>

 

So what I want to do is get the XML data (such as the above) for all of the champions, format it, and enter it into my local database. I would like to do it in such a way that I can set up a cron job (as xyph mentioned) which runs every 2 weeks and overwrites all the data in my database with the new data.
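One possible way to turn the infobox part of that wikitext into key/value pairs is a line-by-line parse. This is only a sketch: `parseInfobox` is a made-up name, the sample is a shortened copy of the Caitlyn output above, and it assumes every field sits on its own line in the `| key = value` form shown there. Nested templates in a value are kept as raw text.

```php
<?php
// Collects "| key = value" lines that follow "{{infobox champion" and stops
// at the first line beginning with "}}", which closes the template.
function parseInfobox(string $wikitext): array
{
    $start = stripos($wikitext, '{{infobox champion');
    if ($start === false) {
        return [];
    }

    $lines = preg_split('/\R/', substr($wikitext, $start));
    array_shift($lines); // drop the "{{infobox champion" line itself

    $fields = [];
    foreach ($lines as $line) {
        if (strncmp($line, '}}', 2) === 0) {
            break;
        }
        if (preg_match('/^\|\s*(\w+)\s*=\s*(.*)$/', $line, $m)) {
            $fields[$m[1]] = trim($m[2]);
        }
    }
    return $fields;
}

$sample = <<<'WIKI'
{{C-top}}{{infobox champion
| name        = Caitlyn
| herotype    = {{Attributes|Ranged|Carry}}
| hp          = 390 (+80)
| range       = 650

}}'''Caitlyn, the Sheriff of Piltover''' is a [[champion]].
==Abilities==
{{Abilities
|ver         = v1.0.0.129
}}
WIKI;

print_r(parseInfobox($sample));
```

Values such as `390 (+80)` still need splitting into a base value and a per-level value before they go into numeric database columns.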

 

Some pointers, please :)

Are you more comfortable with arrays, or objects?

 

http://www.mediawiki.org/wiki/Alternative_parsers

 

<?php

// Object: fetch the page as XML and parse it with SimpleXML.
$string = file_get_contents('http://leagueoflegends.wikia.com/api.php?action=query&titles=Caitlyn_the_Sheriff_of_Piltover&prop=revisions&rvprop=content&format=xml');
$xml = new SimpleXMLElement($string);
print_r($xml);

// Array: fetch the same page as JSON and decode it into nested arrays.
$string = file_get_contents('http://leagueoflegends.wikia.com/api.php?action=query&titles=Caitlyn_the_Sheriff_of_Piltover&prop=revisions&rvprop=content&format=json');
$json = json_decode($string, TRUE);
print_r($json);

?>
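Once the JSON is decoded, the page text still has to be dug out of the nested array. The sketch below assumes the classic MediaWiki JSON layout, where pages are keyed by page id and the revision text sits under the `*` key. The helper name `extractWikitext` and the shortened sample response are illustrative, and the layout should be checked against what the wiki actually returns.

```php
<?php
// Returns the wikitext of the first page that has revision content, or null
// (for example when the requested title does not exist).
function extractWikitext(array $response): ?string
{
    $pages = $response['query']['pages'] ?? [];
    foreach ($pages as $page) {
        if (isset($page['revisions'][0]['*'])) {
            return $page['revisions'][0]['*'];
        }
    }
    return null;
}

// Shortened, hand-written stand-in for a real API response.
$sample = '{"query":{"pages":{"117773":{"pageid":117773,"ns":0,'
    . '"title":"Caitlyn the Sheriff of Piltover",'
    . '"revisions":[{"*":"{{C-top}}{{infobox champion\n| name = Caitlyn\n}}"}]}}}}';

$json = json_decode($sample, true);
echo extractWikitext($json), "\n";
```

Looping over the pages avoids hard-coding the page id, which differs for every champion.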

I was just reading up on file_get_contents just now :D I'm not too familiar with objects yet, but I'm decent with arrays I guess.

 

Also, I thought about getting the data from their main site: http://na.leagueoflegends.com/champions

But there is no way for me to turn that into XML. Is there any way to do an HTTP GET query that turns the page into XML, and then use it with cURL in PHP? I mean, I would like to get into the divs and read the data there.

 

I also found this: https://github.com/promisedlandt/loldata/blob/9c214067af09b5fb305d765701af83c7c4f93f4e/lib/loldata/champion.rb

Which does something similar, but is in some other language.

That breaks their terms of use.

 

http://na.leagueoflegends.com/legal/termsofuse

F. Transmitting or facilitating the transmission of any content that contains a virus, corrupted data, trojan horse, bot keystroke logger, worm, time bomb, cancelbot or other computer programming routines that are intended to and/or actually damage, detrimentally interfere with, surreptitiously intercept or mine, scrape or expropriate any system, data or personal information

 

You've already got the data... what's the problem?

1. I have the data for one champion; I need some way to do this for every champion.

2. I need a way to turn the data into variables which PHP can insert into the local database.
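For the first point, the MediaWiki query API accepts several page titles in one request when they are separated by `|`. The sketch below only builds the request URLs; `buildQueryUrls` is a made-up name, and the default of 50 titles per request is an assumption based on the usual MediaWiki limit for ordinary clients, which this particular wiki may or may not share.

```php
<?php
// Splits the title list into batches and builds one API URL per batch.
function buildQueryUrls(array $titles, int $perRequest = 50): array
{
    $base = 'http://leagueoflegends.wikia.com/api.php';
    $urls = [];
    foreach (array_chunk($titles, $perRequest) as $chunk) {
        $urls[] = $base . '?' . http_build_query([
            'action' => 'query',
            'titles' => implode('|', $chunk), // "|" is URL-encoded as %7C
            'prop'   => 'revisions',
            'rvprop' => 'content',
            'format' => 'json',
        ], '', '&');
    }
    return $urls;
}

foreach (buildQueryUrls(['Ahri', 'Caitlyn', 'Zed'], 2) as $url) {
    echo $url, "\n";
}
```

Each URL can then be passed to `file_get_contents` and decoded with `json_decode`, one batch at a time.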

Okay, so I get a LOT of data, and some of the chunks are in the following format:

 

|-
|style="text-align:left;" bgcolor="#242424"|{{ci|Ahri}}
|bgcolor="#242424"|Mage
|bgcolor="#102E00"|40
|bgcolor="#420300"|30
|bgcolor="#000A4C"|80
|bgcolor="#30004C"|80
|bgcolor="#242424"|2011-12-14
|bgcolor="#242424"|6300
|bgcolor="#242424"|975

 

Now, is there any way (with a regex or something) to get what's inside the {{ci|Ahri}}?

As in, I need a way to take {{ci|Ahri}} and output Ahri and then store it in an array, which I will pass into another file_get_contents.
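A regex along these lines should do it, as a sketch: `preg_match_all` collects every `{{ci|Name}}` occurrence, and the capture group holds the name. The function name `extractChampionNames` is made up, and the pattern assumes the `ci` template only ever takes a single argument, as in the chunk above.

```php
<?php
// Returns the unique names found inside {{ci|...}} templates, in order of
// first appearance.
function extractChampionNames(string $wikitext): array
{
    // \{\{ci\|   literal "{{ci|"
    // ([^|}]+)   the name: anything up to the next "|" or "}"
    // \}\}       literal "}}"
    preg_match_all('/\{\{ci\|([^|}]+)\}\}/', $wikitext, $matches);
    return array_values(array_unique(array_map('trim', $matches[1])));
}

$chunk = <<<'WIKI'
|-
|style="text-align:left;" bgcolor="#242424"|{{ci|Ahri}}
|bgcolor="#242424"|Mage
|bgcolor="#102E00"|40
WIKI;

print_r(extractChampionNames($chunk));
```

The resulting array can be looped over to build the page titles for the next round of `file_get_contents` calls.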
