PHP Search Engine


c_pattle

I'm thinking about building a search engine as a fun project and a way to develop my skills. However, I'm not sure where to start...

I want it to be a comparison search engine for computer parts, so if the user types in "RAM" it would display results from PC World, Ebuyer, etc. I was thinking of using cURL, but how do I know where the search results are on each page? For example, say I fetch an HTML page of results from PC World and store it in a variable: how do I then know what to strip out to get the results I want? Would I just have to look at the source code? And would I then have to work this out for each site I wanted to search, with a different method of extracting the results for each one?

If anyone has done anything like this in the past and could give me some advice, that would be great.

Thanks

To get results from a website, you'll need to scrape them from its pages. Before scraping a site, read its terms of use; otherwise you may face legal action.

You can load and parse the page with DOMDocument, or fetch the raw HTML with regular functions like file_get_contents() and fopen().

 

$content = file_get_contents('http://www.domain.tld');
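
Once the HTML is in a string, DOMDocument together with DOMXPath can pull out just the result nodes. The following is only a sketch: extractResults(), the rule keys, the class names and the sample HTML are all made up for illustration, and each real site would need its own XPath rules worked out by reading its page source.

```php
<?php
/**
 * Extract name/price pairs from an HTML string using per-site XPath rules.
 * $siteRules is a hypothetical structure: 'item' selects each result block,
 * 'name' and 'price' are evaluated relative to that block.
 */
function extractResults(string $html, array $siteRules): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $results = [];

    foreach ($xpath->query($siteRules['item']) as $item) {
        $name = $xpath->query($siteRules['name'], $item)->item(0);
        $price = $xpath->query($siteRules['price'], $item)->item(0);

        // Skip blocks that do not contain both pieces of information.
        if ($name === null || $price === null) {
            continue;
        }

        $results[] = [
            'name' => trim($name->textContent),
            'price' => trim($price->textContent),
        ];
    }

    return $results;
}

// Assumed markup for one imaginary shop; a real site will look different.
$exampleRules = [
    'item' => '//div[@class="product"]',
    'name' => './/h2',
    'price' => './/span[@class="price"]',
];

$sampleHtml = '<html><body>'
    . '<div class="product"><h2>8GB DDR4 RAM</h2><span class="price">29.99</span></div>'
    . '<div class="product"><h2>16GB DDR4 RAM</h2><span class="price">54.99</span></div>'
    . '</body></html>';

$results = extractResults($sampleHtml, $exampleRules);
```

Keeping the rules in an array per site means the extraction code stays the same and only the XPath expressions change from one shop to the next.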

 

Using file_get_contents() or DOMDocument is memory intensive, as both load the entire page into memory. I prefer to use a combination of fopen() and fread() to read the data in chunks and conserve memory in the process.
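
A rough sketch of the fopen() and fread() approach, in case it helps. readInChunks() and its callback are hypothetical names, not a standard API, and opening an http:// URL this way assumes allow_url_fopen is enabled in php.ini.

```php
<?php
/**
 * Read a file or URL in fixed-size chunks and hand each chunk to a callback,
 * so only one chunk is held in memory at a time.
 * Returns the total number of bytes read.
 */
function readInChunks(string $source, int $chunkSize, callable $onChunk): int
{
    $handle = fopen($source, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Could not open $source");
    }

    $total = 0;
    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);

            // Stop on a read error or when no more data arrives.
            if ($chunk === false || $chunk === '') {
                break;
            }

            $total += strlen($chunk);
            $onChunk($chunk);
        }
    } finally {
        fclose($handle);
    }

    return $total;
}

// Usage sketch (the URL is the placeholder from the post above):
// readInChunks('http://www.domain.tld', 8192, function (string $chunk) {
//     // inspect or store $chunk here
// });
```

Note that on network streams fread() may return fewer bytes than requested, so the callback should not assume every chunk is exactly $chunkSize long.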

I use cURL and MySQL for mine.

There's actually a lot more involved than one would think.

It takes a lot of time to connect to all the sites just to fetch the information you want to save.
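
One way to cut down the waiting is to run the requests in parallel with the curl_multi functions instead of one after another. This is only a sketch under a few assumptions: the cURL extension is installed, fetchAll() is a hypothetical helper name, and the shop URLs in the usage comment are placeholders.

```php
<?php
/**
 * Fetch several URLs concurrently.
 * Returns an array with the same keys as $urls; each value is the response
 * body, or null when the request did not end with a 2xx HTTP status.
 */
function fetchAll(array $urls, int $timeoutSeconds = 10): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
            CURLOPT_FOLLOWLOCATION => true, // follow redirects
            CURLOPT_TIMEOUT => $timeoutSeconds,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $stillRunning);
        if ($stillRunning > 0) {
            curl_multi_select($multi, 1.0); // wait for activity on any handle
        }
    } while ($stillRunning > 0 && $status === CURLM_OK);

    $pages = [];
    foreach ($handles as $key => $ch) {
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $body = curl_multi_getcontent($ch);
        $pages[$key] = ($code >= 200 && $code < 300) ? $body : null;
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $pages;
}

// Usage sketch with placeholder URLs:
// $pages = fetchAll([
//     'shop_a' => 'http://shop-a.example/search?q=RAM',
//     'shop_b' => 'http://shop-b.example/search?q=RAM',
// ]);
```

The total wait is then roughly that of the slowest site rather than the sum of all of them, though each site still needs its own parsing step afterwards.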

 

Pulling in random links from various websites, storing them, and just showing the latest discovered links by date is easier (more like a normal search engine).

If you wanted to build a serious search engine, I would suggest using Cassandra as the database and Python to search the data.
