Topic: This Reg no longer works?


Modernvox (topic starter)

This Reg no longer works?
« on: February 09, 2010, 02:53:48 PM »
 preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html,$posts,PREG_SET_ORDER);
Here are a few of the target rows:
<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>
<p><a href="http://southcoast.craigslist.org/muc/1564061446.html">Seeking 5th member</a> - <font size="-1"> (RI/Southern, MA)</font></p>

<p><a href="http://southcoast.craigslist.org/muc/1563926651.html">Gigging cover band in search for new lead guitarist </a> - <font size="-1"> ((south shore))</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1563506659.html">Acoustic Guitarist Wanted</a> - <font size="-1"> (New Bedford/Fall River/Providence/East Bay area)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1563233552.html">Need Help Writing Raps?</a> - <font size="-1"> (Fall River, Ma)</font> <span class="p"> pic</span></p>

<h4>Wed Jan 20</h4>
<p><a href="http://southcoast.craigslist.org/muc/1562404109.html">drums and guitar looking for bass w/ vocals</a> - <font size="-1"> (taunton)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1562389093.html">wack ass egyptians need guitarist</a> - <font size="-1"> (quincy/whitman)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1561458375.html">Looking for a few good men - Bass/Baritones</a> - <font size="-1"> (Fall River Area)</font> <span class="p"> pic</span></p>

<h4>Tue Jan 19</h4>
<p><a href="http://southcoast.craigslist.org/muc/1561104614.html">singer/guitarist looking</a> - </p>
<p><a href="http://southcoast.craigslist.org/muc/1560864071.html">south shore cover band needs bass</a> - <font size="-1"> (plymouth,ma)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1559645835.html">Looking for Rhythm Guitarist</a> - <font size="-1"> (Taunton, Ma)</font></p>

<h4>Mon Jan 18</h4>
<p><a href="http://southcoast.craigslist.org/muc/1558191492.html">Working Cover Rock Band Looking for GOOD Lead Singer</a> - <font size="-1"> (SE MA/RI)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1557842807.html">wanted: guitar player (christian)</a> - <font size="-1"> (Dartmouth)</font></p>


Here's my code, which worked fine up until yesterday. Now it only matches when I strip the <font ...> part from the end of the pattern.
Code: [Select]
<?php
error_reporting(E_ALL);
ini_set("display_errors", 1);
$st = isset($_POST['submit']) ? $_POST['state'] : '';

$urls = array("http://" . $st . ".craigslist.org");
foreach ($urls as $url) {
    $html = file_get_contents("$url/muc/");

    preg_match_all('/<a href="([^"]+)">([^<]+)<\/a><font size="-1">([^"]+)<\/font>/s', $html, $posts, PREG_SET_ORDER);

    //echo "<pre>";print_r($posts);
    $i = 1; //set start point
    $limit = 60; //set limit
    foreach ($posts as $post) {
        //print_r($post[0]); //HTML
        $post[2] = str_ireplace($url, "", $post[2]); //remove domain
        echo "<a href=\"$url{$post[1]}\" target=\"_blank\">{$post[2]}<font size=\"3\">{$post[3]}</font></a><br />";
        print "<BR />\n";

        if ($i == $limit) {
            break;
        }
        $i++;
    }
}
?>


When I remove <font size="-1">([^"]+)<\/font> it works; however, it then displays some things I don't want. Before this, the regex worked perfectly.

Thanks in advance

MadTechie

Re: This Reg no longer works?
« Reply #1 on: February 09, 2010, 10:12:16 PM »
That's because they added a " - " between the closing </a> and the <font> tag.
A simple regex update would be:
Code: [Select]
<a href="([^"]+)">([^<]+)</a> - (?:<font size="-1">([^"]+)</font>)?
Also, I have pointed this out before, but do you have permission to collect this data?
If you don't, it would be unlawful!
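
For anyone trying this out, here is a minimal sketch of that pattern dropped into preg_match_all, run against two of the sample rows from the first post instead of a live page. The '#' delimiter and the $location variable are my own choices for illustration, not part of the original script. Since the <font> group is now optional, rows without a location (like "Pagan Musicians") may come back with no index 3 at all, so the sketch checks for it before use.

```php
<?php
// Two sample rows copied from the question: one with a location, one without.
$html = <<<'HTML'
<p><a href="http://southcoast.craigslist.org/muc/1564255288.html">Drummer looking for weeknight gigs</a> - <font size="-1"> (New Bedford)</font></p>
<p><a href="http://southcoast.craigslist.org/muc/1564167149.html">Pagan Musicians</a> - </p>
HTML;

// '#' is used as the delimiter (an assumption, chosen for readability) so the
// slashes in </a> and </font> do not need to be escaped.
$pattern = '#<a href="([^"]+)">([^<]+)</a> - (?:<font size="-1">([^"]+)</font>)?#s';

preg_match_all($pattern, $html, $posts, PREG_SET_ORDER);

foreach ($posts as $post) {
    // The optional group may be missing for rows that have no <font> tag,
    // so fall back to an empty string instead of reading an undefined index.
    $location = isset($post[3]) ? trim($post[3]) : '';
    echo $post[1], ' | ', $post[2], ' | ', $location, "\n";
}
```

If you keep the original '/' delimiter instead, the closing tags need to stay escaped as <\/a> and <\/font>, same as in the old pattern.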