dilbertone
Posted May 21, 2011

Hello dear community, good evening!

I want to scrape a dataset of more than 2,700 records on foundations in Switzerland. You can see it here: http://www.edi.admin.ch/esv/00475/00698/index.html?lang=de

    <?php
    // Original PHP code by Chirp Internet: www.chirp.com.au
    // Please acknowledge use of this code by including this header.

    $url = "http://www.edi.admin.ch/esv/00475/00698/index.html?lang=de";
    $input = @file_get_contents($url) or die("Could not access file: $url");

    // Matches every <a ...href=...>...</a> element on the page.
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";

    if (preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // $match[2] = link address (href): the data I want to collect
            // $match[3] = link text: the text I need to collect (see a detail page)
        }
    }
    ?>

Well, to be frank, I am not sure about this: my console gives back some bad errors. Can you please help me with this issue?

Love to hear from you,
db1

BTW, see a detail page: http://www.edi.admin.ch/esv/00475/00698/index.html?lang=de&webgrab_path=http://esv2000.edi.admin.ch/d/entry.asp?Id=3221

It contains the following information:

Name: "baiji.org" Foundation
Schlüsselwort: BAIJI
Adresse: Seefeldstr. 94 8008 Zürich
Mail: august@baiji.com
Zweck:

BTW, here is a translation of the field labels:

Name: name
Schlüsselwort: keyword
Adresse: address
Mail: mail
Zweck: purpose
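For comparison, here is a minimal sketch of the same link-collecting step done with PHP's built-in DOMDocument class instead of a regular expression. This is a swapped-in technique, not the Chirp Internet code above. The sample markup, the second entry (Id=3222), and the helper name extract_links are hypothetical and only for illustration; the real page structure was not checked.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
// The second entry (Id=3222) is invented for illustration only.
$html = '<ul>'
      . '<li><a href="entry.asp?Id=3221">"baiji.org" Foundation</a></li>'
      . '<li><a href="entry.asp?Id=3222">Example <b>Stiftung</b></a></li>'
      . '</ul>';

// Hypothetical helper: returns the href and visible text of every <a> element.
function extract_links(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),   // link address
            'text' => trim($a->textContent),      // link text without nested tags
        ];
    }
    return $links;
}

$links = extract_links($html);
foreach ($links as $link) {
    echo $link['href'], ' => ', $link['text'], PHP_EOL;
}
```

With the real page, the string in $html would presumably be replaced by the result of file_get_contents($url), and each collected href would then be fetched in turn to read the detail page.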