Jump to content

determining which words are searched the most


Smudly

Recommended Posts

Hey, I have a search feature that searches for words in my database.

I am currently storing all searches that users make in a table called `search`. I need to list the top 10 words that are searched for the most, excluding common words like "the, and, or, etc".

Because the searches users make are stored into a table, it should be fairly easy.

Could anyone give an example of how to do this?

 

Here's how my sql table looks:

 

table name: search

 

columns:

id (autoincrement)

phrase

date

resultsfound

ip

 

 

Link to comment
Share on other sites

 

$common_words = array(); // An array of words you don't want to include in the top search results.$search_count = array();$result = mysql_query('SELECT phrase FROM search', $db); // Change $db to your database variable.while ($result_array = mysql_fetch_assoc($result)) {   $search_array = explode(' ', $result_array['phrase']);   $search_array = array_diff($search_array, $common_words);   foreach ($search_array as $value) {      $search_count[$value]++;   }}sort($search_count, SORT_NUMERIC);reset($search_count);for($i=0; $i < 10; $i++) {   if (key($search_count))      echo key($search_count); // This is where the top $i+1'th search term is outputted.   else      $i=99;   next($search_count);}

 

 

That should do it, but it's untested.

Link to comment
Share on other sites

That is going to be a hugh resource hog over time. Not only do you have to loop through all the records, you also have to explode the records and loop over that result. You shouldn't have to loop through all the records to find the top items - a database can do that dynamically. But, you can't do it with the current database structure.

 

I think you can make this much easier with one small change in the db. Create another table called search_words with the following columns:

- search_id (foreign key back to the seach table)

- word

- count

 

Then whenever anyone performs a search, in addition to adding a record to the search table you would explode the search term and INSERT the words into this secondary table using ON DUPLICATE KEY so new words will be added while existing words will have their count incremented (the word field would be the key). After creating the table you coul dalso create a one time script to add the records from all the current searches.

 

Then, if you need to words most commonly searched you can just do a single query of the search_words table ordering the results according to the count value.

Link to comment
Share on other sites

Just to clarify, my solution is to establish a current listing of search terms for a database that is already in use. It is not a solution that should be run repeatedly on a live server.

 

I wasn't knocking your code, just pointing out that the current database design makes this a very resource intensitive tasks. Running a script such as the one you provided once a night to update the results would be one way to mitigate the problems. Although I think by making a small chnage to the database and adding the ability to capture the metrics in real time will provide the best results.

Link to comment
Share on other sites

Thanks for all the ideas guys. I ended up creating an additional table which captures all words separately. I can then increment each one of these rows when the word comes up in the search.

This seems to be the best way for me.

 

If anyone is interested, here is the code I used. Maybe it could help you in the future:

 

	$word = explode(" ", $var);
$num = 0;
foreach($word as $key=>$value){

	$wordexist = mysql_query("SELECT word FROM searchedwords WHERE word='$word[$num]'");
	$wordcount = mysql_num_rows($wordexist);
	if($wordcount!=0){
	//UPDATE
		$wordget = mysql_query("SELECT * FROM searchedwords WHERE word='$word[$num]'");
		$wordrow = mysql_fetch_assoc($wordget);
		$todayword = $wordrow['today'];
		$totalword = $wordrow['total'];
		$newtoday = $todayword+1;
		$newtotal = $totalword+1;

		$updateword = "UPDATE `searchedwords` SET `today`='$newtoday', `total`='$newtotal' WHERE `word`='$word[$num]'";
		mysql_query($updateword);
		$num++;
	}
	else{
		$addone = 1;
		$wordinsert = mysql_query("INSERT INTO searchedwords VALUES ('','$word[$num]','$addone','$addone')");
		$num++;
	}

}

 

So when somebody types in "Php Freaks Is Cool", it checks if the word is in the searchedwords table. If it is, it just increments the `today` field as well as the `total` field. Then I have a cron job that resets the today fields every day at midnight. Capturing this information on an admin page is also very simple. You just use something like:

 

 

 


// This query ignores the words in the parenthesis (the, and, or, a) etc. 
$result1 = mysql_query("SELECT * FROM searchedwords WHERE word NOT IN ('the', 'and', 'or', 'a', 'in', 'of', 'to', 'for', 'is', 'we', 'are', 'that', 'have', 'been') ORDER BY today DESC"); 



while($row1 = mysql_fetch_array($result1)){
	if(($i%2)==0){
		$bgcolor = "#f5f5f5";
	}
	else{
		$bgcolor = "#ccddff";
	}
	  echo "<tr>";
	  echo "<td align='center' valign='top' width='50px' bgcolor='$bgcolor'>" .$row1['id']. "</td>";
	  echo "<td align='center' valign='top' width='250px' bgcolor='$bgcolor'>" .$row1['word']. "</td>";
	  echo "<td align='center' width='100px' bgcolor='$bgcolor'>".$row1['today']."</td>";
	  echo "<td align='center' width='100px' bgcolor='$bgcolor'><span style='color: $rescolor;'>".$row1['total']."</span></td>";
	  echo "</tr>";
	  $i++;	
}

Link to comment
Share on other sites

Ok, you really don't want to run queries in a loop. It really puts a strain on the server. And that logic is very overcomplicated.

 

Not sure why you are tracking two different totals, one for today and another for total. I assume you have a nightly query that runs to empty the today value. Anyway, set the field 'today' and 'total' to have default values of 1. You only then need three fields: word, today and total. The word field should be the primary field. You can then do what you need with just the following:

//process the words into array of query values
$valueParts = array();
foreach(explode(' ', trim($var)) as $word)
{
    if($word!='') { $valueParts[] = "('{$word}')"; }
}
//Create combined query
$VALUES = implode(', ', $valueParts)
$query = "INSERT INTO searchedwords (`word`)
          VALUES {$VALUES}
            ON DUPLICATE KEY UPDATE today=today+1, total=total+1";
$result = mysql_query($query) or die(mysql_error());

 

The result of that much shorter and efficinet script would be that any word which does not ecurrently exist in the table will be added with values of '1' for today and total. If a word does exist, the values for today and total will be incremented by 1.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.