torontobb Posted February 23, 2011

Hi Everyone,

I started using Simple HTML DOM today and have spent four hours without getting the result I want. I want to extract information from the following markup:

<div class="listing_content">
  <span class="serialNumb" style="line-height: 21px;">77777</span>
  <br />
  444 ASDF, Alpha, Tango, Beta
  <br />
  77777 Director:99999
  <div>
    <img title='web' src='http://cpgimg.com/images/icon_sm_web.gif' alt='web'/>
    <a href='javascript:void(0)' onClick="window.open('/redir.jsp?p_url=http:%2f%2fwww.cnn.com&p_cid=2707304&p_hid=279E00&p_ct=3527&p_pr=KO&p_fr=U');" class='listing_link'>website</a>
    <img title='email' src='http://cpgimg.com/images/icon_sm_mail.gif' alt='email'/>
    <a class='listing_link' href="javascript:void(0)" onclick="popupEmail('/email.jsp?lang=0&p_cid=2707304');(new Image()).src='/redir.jsp?p_url=&p_cid=2707304&p_hid=279E00&p_ct=3527&p_pr=ON&p_fr=E&msec='+(new Date()).getMilliseconds()">E-mail</a>
  </div>
</div>

The pieces I need to pull out separately are:

1- serialNumb = 77777
2- 444 ASDF, Alpha, Tango, Beta
3- 77777 Director:99999
4- www.cnn.com

I want all the data to be recorded to different variables so I can upload them to MySQL. Any help with this is much appreciated. I don't have to use Simple HTML DOM, but from my searching it seems to be the best tool (I am just not having much luck with it).

***Keep in mind that the page is full of <div>, <br />, <img>, and other tags. The quoted part is just one excerpt, but the attribute style="line-height: 21px;" is unique and appears only once on the page. The string ('/redir.jsp?p_url is also unique to the URL portion.

Thanks again.
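Editor's note: a minimal sketch of one way to pull the four fields out of the excerpt above. It uses Python's standard-library `html.parser` instead of Simple HTML DOM, since the post says other tools are acceptable. The names `ListingParser`, `parse_listing`, and `SAMPLE` are made up for illustration, and the sketch assumes the real page follows the same structure as the quoted excerpt.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse

# The excerpt quoted in the post, used here as test input.
SAMPLE = """<div class="listing_content">
<span class="serialNumb" style="line-height: 21px;">77777</span> <br />
444 ASDF, Alpha, Tango, Beta <br /> 77777 Director:99999
<div>
<img title='web' src='http://cpgimg.com/images/icon_sm_web.gif' alt='web'/>
<a href='javascript:void(0)' onClick="window.open('/redir.jsp?p_url=http:%2f%2fwww.cnn.com&p_cid=2707304&p_hid=279E00&p_ct=3527&p_pr=KO&p_fr=U');" class='listing_link'>website</a>
<img title='email' src='http://cpgimg.com/images/icon_sm_mail.gif' alt='email'/>
<a class='listing_link' href="javascript:void(0)" onclick="popupEmail('/email.jsp?lang=0&p_cid=2707304');(new Image()).src='/redir.jsp?p_url=&p_cid=2707304&p_hid=279E00&p_ct=3527&p_pr=ON&p_fr=E&msec='+(new Date()).getMilliseconds()">E-mail</a>
</div>
</div>"""


class ListingParser(HTMLParser):
    """Collects the serial number, the loose text lines, and the website host."""

    def __init__(self):
        super().__init__()
        self.in_serial = False  # True while inside <span class="serialNumb">
        self.div_depth = 0      # nesting depth inside div.listing_content
        self.serial = None
        self.lines = []         # text sitting directly inside the listing div
        self.url = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute names arrive lowercased
        if tag == "div":
            if self.div_depth or attrs.get("class") == "listing_content":
                self.div_depth += 1
        elif tag == "span" and attrs.get("class") == "serialNumb":
            self.in_serial = True
        elif tag == "a" and self.url is None:
            # The target URL is percent-encoded inside the onclick handler.
            m = re.search(r"p_url=([^&']+)", attrs.get("onclick") or "")
            if m:
                self.url = urlparse(unquote(m.group(1))).netloc

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth:
            self.div_depth -= 1
        elif tag == "span":
            self.in_serial = False

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text:
            return
        if self.in_serial:
            self.serial = text
        elif self.div_depth == 1:
            self.lines.append(text)


def parse_listing(html):
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return {"serial": parser.serial, "lines": parser.lines, "url": parser.url}


if __name__ == "__main__":
    print(parse_listing(SAMPLE))
```

Each value ends up under its own key (`serial`, `lines[0]`, `lines[1]`, `url`), so it can be bound to separate columns in a MySQL insert. Tracking the `<div>` depth is what keeps the link text ("website", "E-mail") out of the address lines.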
sunfighter Posted February 23, 2011

You need to read up on forms: http://www.w3schools.com/html/html_forms.asp

Since forms are so basic to web sites, I think you should study HTML on this site also. It's a good site for learning.
torontobb (Author) Posted February 24, 2011

Sorry, what about forms? I need to parse an HTML page, which has nothing to do with forms. Am I missing something? I need to parse the HTML page and grab some data based on a pattern.

Thanks
sunfighter Posted February 25, 2011

Forms, because you said "I want all the data to [be] recorded to different variables so I can upload them to MySQL." That's what forms are for. Did you look at the page and site I sent you to?
torontobb (Author) Posted February 25, 2011

I have had a look at it, but I think you picked up on a minor part of my post that is not an issue for me. I need to parse an HTML file. That is it in a nutshell.

I have already overcome a lot of issues, but I still have a problem with the whitespace in the HTML file. If anyone has experience with HTML parsing, please let me know how you would parse the address out of this excerpt (***Note: all the spaces exist in the HTML source file as quoted here):

<span class="basic_serial">(777) 777-7777</span> <br /> 1111 ABCD, EFGH, IJKL <br />

Thanks,
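Editor's note: a minimal sketch of one way to handle the whitespace around the address. It is written in Python with the standard `re` module, not Simple HTML DOM. The function name `address_after_span` is hypothetical, and the extra spaces and newlines in `html` are an assumption meant to mimic the whitespace the post describes, since the exact spacing of the source file is not shown. Matching HTML with a regular expression is fragile and only reasonable here because the pattern (span, `<br />`, text, `<br />`) is so narrow.

```python
import re

# Assumed input: the excerpt from the post, with padding whitespace added
# to imitate the spacing the poster says exists in the real file.
html = """<span class="basic_serial">(777) 777-7777</span>
          <br />
                1111 ABCD, EFGH, IJKL
          <br />"""


def address_after_span(html, span_class="basic_serial"):
    """Return the text between the first two <br> tags after the span."""
    pattern = re.compile(
        r'<span[^>]*class="' + re.escape(span_class) + r'"[^>]*>.*?</span>'
        r"\s*<br\s*/?>(.*?)<br\s*/?>",
        re.DOTALL | re.IGNORECASE,
    )
    m = pattern.search(html)
    if m is None:
        return None
    # split() with no arguments drops leading/trailing whitespace and
    # treats any run of spaces, tabs, or newlines as one separator.
    return " ".join(m.group(1).split())


if __name__ == "__main__":
    print(address_after_span(html))
```

The key step is `" ".join(text.split())`: whatever mix of spaces, tabs, and newlines surrounds the address in the source, the result is a single clean line.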