Any way to identify bots?


RyanSF07

Hello,

 

I'm using the plain and simple script below to count page views. Is there a way to identify a spider bot with PHP, so that if the visitor is a bot the view isn't counted, and otherwise it is?

 

 

Here's my simple script:

 

$querySelect = mysql_query("SELECT * FROM  video WHERE video.id = '$_GET[id]'");
$rowcount = mysql_fetch_assoc($querySelect);
$count = $rowcount['counter'];

if (empty($count)) {
$counter = 1;
$insert = mysql_query("INSERT INTO video (counter) VALUES ($counter) WHERE video.id = '$_GET[id]'");
}

$add = $count+1;
$insertNew = mysql_query("UPDATE video SET counter='$add' WHERE video.id = '$_GET[id]'");

 

thanks,

Ryan


a good bot is hard to spot. bad bots aren't as hard to spot.

 

as far as I know, the most sure-fire way to detect a bot is to require that the visitor execute JavaScript code to count as a hit. but with PHP you can check things like the user agent, a reverse DNS lookup to try to get the remote domain name, other header information that might be available, etc.
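
For the reverse DNS part, here is a minimal sketch of a forward-confirmed lookup, assuming you only want to trust crawlers whose hostnames end in a suffix you choose. The function names and the suffix list are hypothetical, so check each search engine's own documentation for the hostnames it really uses. Also note that gethostbynamel() only returns IPv4 addresses.

```php
<?php
// Hypothetical helpers for illustration; not part of the original script.

// Pure check: does a resolved hostname end with one of the trusted suffixes?
// Suffixes should start with a dot so "evilgooglebot.com" cannot match ".googlebot.com".
function hostMatchesSuffix($host, array $suffixes)
{
    $host = strtolower(rtrim($host, '.'));
    foreach ($suffixes as $suffix) {
        $suffix = strtolower($suffix);
        if (substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Forward-confirmed reverse DNS: IP -> hostname -> IP must round-trip.
function isVerifiedCrawler($ip, array $suffixes)
{
    // gethostbyaddr() returns the IP unchanged on failure, false on malformed input.
    $host = gethostbyaddr($ip);
    if ($host === false || $host === $ip) {
        return false;
    }
    if (!hostMatchesSuffix($host, $suffixes)) {
        return false;
    }
    // gethostbynamel() returns a list of IPv4 addresses, or false on failure.
    $addresses = gethostbynamel($host);
    return $addresses !== false && in_array($ip, $addresses, true);
}
```

Something like isVerifiedCrawler($_SERVER['REMOTE_ADDR'], array('.googlebot.com', '.google.com')) could then sit in front of the counter update. DNS lookups are slow, so you would probably only run it when the user agent already claims to be a crawler.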


As an aside, there's no reason to run 3 database queries in that script. In phpMyAdmin, alter the table so any future entries get 0 as a default value in the `counter` field then set all of the empty `counter` fields to 0 (only needs to be done once).

UPDATE `video` SET `counter` = 0 WHERE `counter` = '' OR `counter` IS NULL

 

Then change the script above so that when a valid visitor is detected it executes this query (after validating/sanitizing $_GET['id'] of course):

"UPDATE `video` SET `counter` = (`counter` + 1) WHERE `id` = {$_GET['id']}"

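For the validating/sanitizing part, here is a rough sketch of one way to do it, assuming you are able to switch from the mysql_* functions to PDO. The function name incrementCounter is made up for the example, and $pdo is assumed to be a connection you have already opened to the database holding the `video` table.

```php
<?php
// Sketch only: validate the id, then run one parameterized UPDATE.
// incrementCounter() is a hypothetical helper, not part of the original script.
function incrementCounter(PDO $pdo, $rawId)
{
    // Accept positive integers only; anything else (including SQL fragments) is rejected.
    $id = filter_var($rawId, FILTER_VALIDATE_INT, array('options' => array('min_range' => 1)));
    if ($id === false) {
        return false;
    }

    // The id is bound as a parameter, so it never becomes part of the SQL text.
    $stmt = $pdo->prepare('UPDATE video SET counter = counter + 1 WHERE id = :id');
    $stmt->execute(array(':id' => $id));

    // True only when exactly one row was updated.
    return $stmt->rowCount() === 1;
}
```

You would call it as incrementCounter($pdo, isset($_GET['id']) ? $_GET['id'] : null) once the visitor has passed whatever bot check you settle on.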

Thank you, Picachu. That worked perfectly.

 

Here's what I have now (below). My question now is: do I have to run this array, or is there some identifying "tag" that all bots have that flags them as a bot?

 

That way I could just check for that tag, and if it's present -- not count the page view.  Please let me know if you have any ideas.

 

Thank you again for your help. 

Ryan

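$botarray = array(
                "Teoma",
                "alexa",
                "froogle",
                "inktomi",
                "looksmart",
                "URL_Spider_SQL",
                "Firefly",
                "NationalDirectory",
                "Ask Jeeves",
                "TECNOSEEK",
                "InfoSeek",
                "WebFindBot",
                "girafabot",
                "crawler",
                "Googlebot",
                "Scooter",
                "Slurp",
                "appie",
                "FAST",
                "WebBug",
                "Spade",
                "ZyBorg");

// $HTTP_USER_AGENT only exists with register_globals on; read the header from $_SERVER instead.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$isBot = false;

foreach ($botarray as $botname) {
    // Case-insensitive substring match; ereg() is deprecated and treats the name as a regex.
    if (stripos($userAgent, $botname) !== false) {
        $isBot = true;

        // Send a notification naming the bot that was matched.
        $recep = "me@yahoo.com";
        $subject = "... bot";
        $text = $botname;
        $headers = "X-Mailer: PHP\n";
        mail($recep, $subject, $text, $headers);

        break; // one match is enough, stop checking
    }
}

// Count the view only when no bot name matched.
// (Setting a flag in the else branch would count every visit, since most names never match.)
if (!$isBot) {
    $id = (int) $_GET['id']; // force the id to an integer before it goes into the SQL
    mysql_query("UPDATE `video` SET `counter` = (`counter` + 1) WHERE `id` = $id");
}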


 

 


Nice bots will read and follow your robots.txt file, and they will give you plenty of information in headers to recognize them, including user agent.

 

Bad bots (yandex.ru for one) and data scraper bots will ignore robots.txt, and they will give no indication that they are bots except that they (probably) will not execute JavaScript. They will usually send a regular web browser user agent.
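
To count only visitors that run JavaScript, one option is to have the page emit a small script that requests a separate counting URL, and do the counter update there instead of in the page itself. A minimal sketch, assuming a hypothetical count.php endpoint and a secret of your own choosing (the token is an extra I added so the endpoint can't be called with arbitrary ids; it is not something from the original script):

```php
<?php
// Hypothetical secret and endpoint name (count.php); both are assumptions for illustration.
define('HIT_SECRET', 'change-me');

// Token tied to one video id, so a hit for id 5 cannot be replayed for id 6.
function hitToken($videoId)
{
    return hash_hmac('sha256', (string) (int) $videoId, HIT_SECRET);
}

// Emit a script that pings count.php; only clients that run JavaScript will request it.
function renderHitBeacon($videoId)
{
    $url = 'count.php?id=' . (int) $videoId . '&t=' . hitToken($videoId);
    // json_encode() produces a safely quoted JavaScript string literal.
    return '<script>new Image().src = ' . json_encode($url) . ';</script>';
}

// In count.php: accept the hit only when the token matches the id.
function isValidHit($videoId, $token)
{
    return is_string($token) && hash_equals(hitToken($videoId), $token);
}
```

The video page would echo renderHitBeacon($id), and count.php would run the UPDATE only when isValidHit($_GET['id'], $_GET['t']) returns true. This needs PHP 5.6 or later for hash_equals(), and it will also skip real visitors who have JavaScript turned off.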
