PHP-based indexing and search implementation


kristo5747

Is there such a thing?

A while back I designed a rudimentary form-based app for my users.

Our suppliers send us hardware manufacturing data in XML files. Each file name is made up of eleven fields separated by tildes, and each field has its own meaning.

The R&D guys wanted to be able to search each field of the file names, so I used regex matching, with decent results.

The problem is that we now have upwards of 2.5 million files, and my app can't hack it anymore.

I looked at Apache Lucene & Solr. Though they seemed like the best solution to my problem, the fields in the file names are not peers of the file content, which is a big no-no with Solr.

What is the best way to implement a PHP app with indexing and search capability for such a large number of files?

Do I have to buy Zend and use Zend_Search? Is that the only way?

Thanks for your input.


At the very least you will want to put that data into a database (MySQL, MongoDB, Postgres); searching through the files themselves is horribly inefficient. You could then try the database's built-in full-text search.

Your best bet for performance and accuracy is a third-party search engine like Lucene.
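
To make the database suggestion a bit more concrete, here is a minimal sketch of the idea: split each file name on the tildes once, store the eleven fields as indexed columns, and let the database do the lookups instead of running a regex over 2.5 million names. Everything specific here is an assumption for illustration only: SQLite stands in for MySQL/Postgres, Python is used just to keep the example short and self-contained, and the column names `f1` to `f11`, the `.xml` extension handling, and the helper names are hypothetical. The same table layout and parameterized queries should carry over to PHP through PDO.

```python
import sqlite3
from pathlib import Path

FIELD_COUNT = 11  # the original post says eleven tilde-separated fields
COLUMNS = [f"f{i}" for i in range(1, FIELD_COUNT + 1)]  # hypothetical column names


def parse_name(file_name):
    """Split a name like 'a~b~...~k.xml' into its eleven fields."""
    stem = Path(file_name).stem  # drop the directory part and the final extension
    fields = stem.split("~")
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}: {file_name}")
    return fields


def build_index(conn, file_names):
    """Create the table, one index per field, and load the parsed file names."""
    cols = ", ".join(f"{c} TEXT" for c in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, {cols})")
    for c in COLUMNS:
        conn.execute(f"CREATE INDEX IF NOT EXISTS idx_{c} ON files ({c})")
    placeholders = ", ".join("?" for _ in range(FIELD_COUNT + 1))
    rows = ([name, *parse_name(name)] for name in file_names)
    with conn:  # commit all inserts as a single transaction
        conn.executemany(f"INSERT OR REPLACE INTO files VALUES ({placeholders})", rows)


def search(conn, **criteria):
    """Return names whose fields equal the given values, e.g. search(conn, f2='x')."""
    for col in criteria:
        if col not in COLUMNS:  # only known column names are ever put into the SQL text
            raise KeyError(col)
    where = " AND ".join(f"{col} = ?" for col in criteria) or "1"
    cur = conn.execute(
        f"SELECT name FROM files WHERE {where} ORDER BY name",
        list(criteria.values()),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn, [
        "a~b~c~d~e~f~g~h~i~j~k.xml",
        "a~x~c~d~e~f~g~h~i~j~z.xml",
    ])
    print(search(conn, f1="a", f2="x"))
```

Exact-match lookups like these can use the per-column indexes. Pattern searches with a leading wildcard generally cannot, which is where the database's full-text search or a dedicated engine such as Lucene comes in.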
