Nuv
Posted March 19, 2011

I am extracting data from a website, and I would like to join the sentences below, separated by commas. The only solution that comes to mind is to write the text to a file and then use the file functions. However, I was wondering whether I can use str_replace here. Can anyone please point me in the right direction?

Hospital 30-Day Death (Mortality) Rates from Heart Attack: 15.2% (No different than U.S. National Rate)
Hospital 30-Day Readmission Rates from Heart Attack: % (Number of Cases Too Small*)
Hospital 30-Day Death (Mortality) Rates from Heart Failure: 10.5% (No different than U.S. National Rate)
Hospital 30-Day Readmission Rates from Heart Failure: 29.9% (Worse than U.S. National Rate)
Hospital 30-Day Death (Mortality) Rates from Pneumonia: 9.8% (No different than U.S. National Rate)
Hospital 30-Day Readmission Rates from Pneumonia: 20.6% (No different than U.S. National Rate)
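A minimal sketch of one way to do this without touching a file, assuming the sentences arrive as a single string separated by line breaks (the question does not show the separator, so that is an assumption). The sample string is shortened, and the helper name `join_sentences` is hypothetical. str_replace is used only to normalize the line endings; explode and implode do the joining.

```php
<?php
// Hypothetical input: assumes the sentences are separated by line breaks.
$text = "Hospital 30-Day Death (Mortality) Rates from Heart Attack: 15.2% (No different than U.S. National Rate)\r\n"
      . "Hospital 30-Day Readmission Rates from Heart Attack: % (Number of Cases Too Small*)\n"
      . "Hospital 30-Day Death (Mortality) Rates from Heart Failure: 10.5% (No different than U.S. National Rate)";

// Hypothetical helper: join line-separated sentences with ", ".
function join_sentences($text)
{
    // Normalize Windows and old Mac line endings to "\n" first.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);
    // Split into lines, trim each one, and drop empty lines.
    $lines = array_filter(array_map('trim', explode("\n", $text)), 'strlen');
    return implode(', ', $lines);
}

echo join_sentences($text), "\n";
```

If the sentences are already in an array (for example, from preg_match_all), implode(', ', $array) alone is enough.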
kenrbnsn
Posted March 19, 2011

How are you getting these strings? Please post your code.

Ken
Nuv (Author)
Posted March 19, 2011

Below is my code. It's very noobish code. I understand I could use cURL, but since I have no username or password to enter, I thought file_get_contents should be okay. (I haven't used cURL, so I'm probably scared of it.) Since I have posted my code, may I know how I should improve it? Also, the "address" I am getting contains a line break, and I would like to replace the line breaks and white space with a comma.

<?php
include("config.php");

$data = "<td><a href='/hospitals-in/Alabama'>Alabama</a></td>
<td><a href='/hospitals-in/District-of-Columbia'>District of Columbia</a></td>
<td><a href='/hospitals-in/Kentucky'>Kentucky</a></td>
<td><a href='/hospitals-in/Montana'>Montana</a></td>
<td><a href='/hospitals-in/Ohio'>Ohio</a></td>
<td><a href='/hospitals-in/Texas'>Texas</a></td>
<td><a href='/hospitals-in/Alaska'>Alaska</a></td>
<td><a href='/hospitals-in/Florida'>Florida</a></td>
<td><a href='/hospitals-in/Louisiana'>Louisiana</a></td>
<td><a href='/hospitals-in/Nebraska'>Nebraska</a></td>
<td><a href='/hospitals-in/Oklahoma'>Oklahoma</a></td>
<td><a href='/hospitals-in/Utah'>Utah</a></td>
<td><a href='/hospitals-in/America-Samoa'>America Samoa</a></td>
<td><a href='/hospitals-in/Georgia'>Georgia</a></td>
<td><a href='/hospitals-in/Maine'>Maine</a></td>
<td><a href='/hospitals-in/Nevada'>Nevada</a></td>
<td><a href='/hospitals-in/Oregon'>Oregon</a></td>
<td><a href='/hospitals-in/Vermont'>Vermont</a></td>
<td><a href='/hospitals-in/Arizona'>Arizona</a></td>
<td><a href='/hospitals-in/Hawaii'>Hawaii</a></td>
<td><a href='/hospitals-in/Maryland'>Maryland</a></td>
<td><a href='/hospitals-in/New-Hampshire'>New Hampshire</a></td>
<td><a href='/hospitals-in/Pennsylvania'>Pennsylvania</a></td>
<td><a href='/hospitals-in/Virgin-Islands'>Virgin Islands</a></td>
<td><a href='/hospitals-in/Arkansas'>Arkansas</a></td>
<td><a href='/hospitals-in/Idaho'>Idaho</a></td>
<td><a href='/hospitals-in/Massachusetts'>Massachusetts</a></td>
<td><a href='/hospitals-in/New-Jersey'>New Jersey</a></td>
<td><a href='/hospitals-in/Puerto-Rico'>Puerto Rico</a></td>
<td><a href='/hospitals-in/Virginia'>Virginia</a></td>
<td><a href='/hospitals-in/California'>California</a></td>
<td><a href='/hospitals-in/Illinois'>Illinois</a></td>
<td><a href='/hospitals-in/Michigan'>Michigan</a></td>
<td><a href='/hospitals-in/New-Mexico'>New Mexico</a></td>
<td><a href='/hospitals-in/Rhode-Island'>Rhode Island</a></td>
<td><a href='/hospitals-in/Washington'>Washington</a></td>
<td><a href='/hospitals-in/Colorado'>Colorado</a></td>
<td><a href='/hospitals-in/Indiana'>Indiana</a></td>
<td><a href='/hospitals-in/Minnesota'>Minnesota</a></td>
<td><a href='/hospitals-in/New-York'>New York</a></td>
<td><a href='/hospitals-in/South-Carolina'>South Carolina</a></td>
<td><a href='/hospitals-in/West-Virginia'>West Virginia</a></td>
<td><a href='/hospitals-in/Connecticut'>Connecticut</a></td>
<td><a href='/hospitals-in/Iowa'>Iowa</a></td>
<td><a href='/hospitals-in/Mississippi'>Mississippi</a></td>
<td><a href='/hospitals-in/North-Carolina'>North Carolina</a></td>
<td><a href='/hospitals-in/South-Dakota'>South Dakota</a></td>
<td><a href='/hospitals-in/Wisconsin'>Wisconsin</a></td>
<td><a href='/hospitals-in/Delaware'>Delaware</a></td>
<td><a href='/hospitals-in/Kansas'>Kansas</a></td>
<td><a href='/hospitals-in/Missouri'>Missouri</a></td>
<td><a href='/hospitals-in/North-Dakota'>North Dakota</a></td>
<td><a href='/hospitals-in/Tennessee'>Tennessee</a></td>
<td><a href='/hospitals-in/Wyoming'>Wyoming</a></td>";

preg_match_all("~<td><a\s+href='(.*?)'>(.*?)</a></td>~", $data, $link);
$countlink = count($link[1]);

for($i=0 ; $i < 1; $i++) // here $i < $countlink ; but using $i < 1 for testing purposes
{
    $sitelink = "http://www.ushospitalfinder.com".$link[1][$i];
    $hospitallink = file_get_contents("$sitelink");
    preg_match_all("~<td><a href=\"/hospital/(.*?)\">(.*?)</a></td>~", $hospitallink, $hospitalinfo);
    $countinfo = count($hospitalinfo[1]);

    for($j=0 ; $j < 1 ; $j++) // here $j < $countinfo ; but using $j < 1 for testing purposes
    {
        $infolink = "http://www.ushospitalfinder.com/hospital/".$hospitalinfo[1][$j];
        $getinfo = file_get_contents("$infolink");

        $regex = "~<b>Name:</b>\s+(.*?)\s+</p>\s+<p>\s+<b>Address:</b>\s+(.*?)\s+</p>\s+<p>\s+<b>Phone:</b>\s+(.*?)\s+</p>\s+<p>\s+<b>Number\s+of\s+Beds:</b>\s+(.*?)\s+</p>\s+<p>\s+<b>Type:</b>\s+(.*?)\s+</p>\s+<p>\s+<b>System:</b>\s+(.*?)\s+</p>\s+<p>\s+<b>Website:</b>\s+<a href=\"(.*?)\">(.*?)</a>\s+</p>\s+<p>\s+(.*?)</p>~s";
        preg_match_all($regex, $getinfo, $critinfo);
        preg_match_all("~<li>\s+(.*?)</li>~s", $getinfo, $servinfo);
        preg_match_all("~<h4>Hospital\s+Quality\s+and\s+Rating\s+information</h4>\s+<p>Data\s+based\s+on\s+2010\s+Health\s+Quality\s+Alliance\s+database</p>\s+<p>\s+<b>(.*?)</b><br>\s+(.*?)</p>~s", $getinfo, $mortality);

        $name = $critinfo[1][0];
        $address = $critinfo[2][0];
        $phone = $critinfo[3][0];
        $beds = $critinfo[4][0];
        $type = $critinfo[5][0];
        $system = $critinfo[6][0];
        $link = $critinfo[7][0];
        $linktext = $critinfo[8][0];
        $accredited = $critinfo[9][0];
        $servinfo = implode(",", $servinfo[1]);

        echo'<pre>';
        echo print_r($mortality);
        echo'</pre>';

        $sql = "INSERT INTO hospital (name, address, phone, beds, hospitaltype, system, link, linktext, accredited, servinfo) VALUES ('$name', '$address', '$phone', '$beds', '$type', '$system', '$link', '$linktext', '$accredited', '$servinfo')";
        // $exec_sql = mysql_query($sql);
    }
}
?>
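For the address question above, a minimal sketch: preg_replace can collapse each line break, together with the white space around it, into a single comma. It assumes the captured address has its parts on separate lines with indentation in between. The sample address and the helper name `address_to_line` are hypothetical, not taken from the site.

```php
<?php
// Hypothetical sample: an address captured with a line break and indentation.
$address = "123 Example Street\n            Sampletown, AL 00000";

// Hypothetical helper: turn a multi-line address into one comma-separated line.
function address_to_line($address)
{
    // \s* on both sides swallows the indentation around the break;
    // [\r\n]+ requires at least one real line break, so ordinary
    // spaces inside a line are left alone.
    return preg_replace('~\s*[\r\n]+\s*~', ', ', trim($address));
}

echo address_to_line($address), "\n";
```

In the script above this would be applied to the captured value, for example $address = address_to_line($critinfo[2][0]);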