Could you explain why you escape the single quotes? Obviously the double ones need to be escaped here, but the single ones?
Some characters should always be escaped and some only need to be escaped sometimes. But it doesn't cause a problem to escape a punctuation character when it doesn't need to be. So instead of having to put any thought into it, I just escape some characters by default. It saves time, plus it makes the code more regression-proof. What if some yahoo modified the code later and decided to define the pattern with single quotes rather than double quotes? My pattern would keep on working, whereas if the single quotes were not escaped it would break.
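Here is a rough sketch of what I mean. The pattern below is a made-up one for illustration (not necessarily the one from earlier in the thread), and the variable names are just placeholders. It has both kinds of quotes escaped, and it is defined once with double quotes and once with single quotes:

```php
<?php
// Hypothetical pattern: capture an href value wrapped in either kind of quote.
// Both quote characters are escaped, whether or not the delimiter requires it.
$double = "%href=[\"\']([^\"\']*)[\"\']%";  // pattern string in double quotes
$single = '%href=[\"\']([^\"\']*)[\"\']%';  // identical text in single quotes

$line = "<a href='somesite1.com'>Some Site 1</a>";

// PHP strips the backslash only from the quote that matches the string
// delimiter; the other backslash reaches the regex engine, which treats
// an escaped quote as a plain quote. Either way the pattern means the same.
preg_match($double, $line, $m1);
preg_match($single, $line, $m2);

echo $m1[1], "\n";
echo $m2[1], "\n";
```

Both calls should capture the same value, so swapping the quote style around the pattern later does not change what it matches.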
Also, sorry to be a pest, but why does the following not work? To me it looks like it's doing a similar thing: looking for a single quote, then returning what's between them.
preg_match("%'([.*]*)'%", $_line, $_matches);
How is that similar? It *looks* like you are trying to match all characters between two single quotes. Inside square brackets the . and * are literal characters, so [.*]* only matches a run of dots and asterisks. The correct pattern for what you were attempting would be: "%'(.*)'%"
However, that will NOT work for what you want. First, it doesn't care where those quotes exist. Second, it would not find text in double quotes, which I assume you also want. Third, and most importantly, it finds the text from the first single quote to the LAST single quote. Also, preg_match() stops after the first match instead of finding all matches; preg_match_all() is the function for that.
If this was your text:
<a href='somesite1.com'>Some Site 1</a> <a href='somesite2.com'>Some Site 2</a>
The regex above would return this:
somesite1.com'>Some Site 1</a> <a href='somesite2.com
Because the * quantifier is "greedy", it matches as many characters as it can, so the match runs all the way to the last single quote. You could make it non-greedy by also using the ? after the *: "%'(.*?)'%"
But my understanding is that the non-greedy method is not efficient, and you should use the method I described above of matching all characters that do not match the ending character: "%'([^']*)'%"
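To see the difference side by side, here is a small sketch that runs the patterns from this post against the sample text above. The variable names are just placeholders I picked for the example:

```php
<?php
// Sample text from the post above.
$text = "<a href='somesite1.com'>Some Site 1</a> <a href='somesite2.com'>Some Site 2</a>";

// Original attempt: [.*]* only matches literal dots and asterisks,
// so nothing between the quotes qualifies and there is no match.
$found = preg_match("%'([.*]*)'%", $text, $broken);

// Greedy: runs from the first single quote to the LAST single quote.
preg_match("%'(.*)'%", $text, $greedy);

// Non-greedy: stops at the nearest closing quote. preg_match_all()
// collects every match instead of stopping after the first one.
preg_match_all("%'(.*?)'%", $text, $lazy);

// Negated character class: match anything that is not a single quote.
preg_match_all("%'([^']*)'%", $text, $negated);

var_dump($found);     // number of matches for the original attempt
echo $greedy[1], "\n";
print_r($lazy[1]);
print_r($negated[1]);
```

On this text the non-greedy and negated-class patterns should give the same two results (somesite1.com and somesite2.com), while the greedy one returns the long string shown above.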