I have a simple PHP page-view counter and would like to know how I can stop spiders and bots, Googlebot in particular, from being counted as views.
Answer 0 (score: 1)
I added this script to my site:
# Spiders list from http://linksku.com
$spiders = array('aspseek','abachobot','accoona','acoirobot','adsbot','alexa','alta vista','altavista','ask jeeves','baidu','crawler','croccrawler','dumbot','estyle','exabot','fast-enterprise','fast-webcrawler','francis','geonabot','gigabot','google','heise','heritrix','ibm','iccrawler','idbot','ichiro','lycos','msn','msrbot','majestic-12','metager','ng-search','nutch','omniexplorer','psbot','rambler','seosearch','scooter','scrubby','seekport','sensis','seoma','snappy','steeler','synoo','telekom','turnitinbot','voyager','wisenut','yacy','yahoo');
// The User-Agent header may be absent, so fall back to an empty string.
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// Flag the request as a crawler if any known spider name appears in the User-Agent.
$_SERVER['HTTP_CRAWLER'] = false;
foreach ($spiders as $spider) {
    if (stripos($userAgent, $spider) !== false) {
        $_SERVER['HTTP_CRAWLER'] = true;
        break;
    }
}
You can then check the value of $_SERVER['HTTP_CRAWLER'] and stop your script from counting the hit.
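As a minimal sketch of that check, assuming the counter is stored as a plain integer in a text file: the function name count_view and the file-based storage are hypothetical and not part of the script above, and the $server argument stands in for $_SERVER.

```php
<?php
// Hypothetical helper (not from the original script): increments a
// file-based counter unless the request has been flagged as a crawler.
function count_view(array $server, string $counterFile): int
{
    // Read the current total; a missing or empty file counts as 0.
    $views = is_file($counterFile) ? (int) file_get_contents($counterFile) : 0;

    if (!empty($server['HTTP_CRAWLER'])) {
        return $views; // crawler: report the total without incrementing
    }

    $views++;
    file_put_contents($counterFile, (string) $views, LOCK_EX);
    return $views;
}
```

On a real page this would be called as count_view($_SERVER, 'counter.txt') after the spider check has run. The read and write are not one atomic step, so a busy site would probably want a database or proper file locking instead.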
Answer 1 (score: 0)
A simple approach is to implement the page counter as an image script:
<img src="counter.php" width="1" height="1" alt="Oh I'm just counting">
Then use robots.txt to mark that URL as off-limits to spiders and crawlers; you can exclude all of them with * or only Googlebot:
User-agent: *
Disallow: /counter.php
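What counter.php itself might look like is sketched below, assuming a file-based counter and a 1x1 transparent GIF as the response. The function names pixel_gif and serve_counter_pixel and the counter.txt path are hypothetical.

```php
<?php
// Hypothetical sketch of counter.php: count the hit, then answer with a
// 1x1 transparent GIF so the <img> tag has something valid to load.

// Returns the raw bytes of a 1x1 transparent GIF.
function pixel_gif(): string
{
    return base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
}

// Increments the counter file (assumed storage) and sends the image.
function serve_counter_pixel(string $counterFile): void
{
    $views = is_file($counterFile) ? (int) file_get_contents($counterFile) : 0;
    file_put_contents($counterFile, (string) ($views + 1), LOCK_EX);

    header('Content-Type: image/gif');
    header('Cache-Control: no-store'); // ask browsers not to cache, so each view is a new request
    echo pixel_gif();
}

// In counter.php itself you would end with:
// serve_counter_pixel(__DIR__ . '/counter.txt');
```

Only crawlers that honor robots.txt will skip the URL, so this filters well-behaved bots such as Googlebot but not every automated client.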
Another approach is simply to check stristr($_SERVER["HTTP_USER_AGENT"], "Googlebot") in the script and, if it matches, not start the counter at all.
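A minimal sketch of that check, wrapped in a hypothetical helper named is_googlebot. It uses stripos rather than stristr because only a yes/no answer is needed, and it treats a missing User-Agent header as "not Googlebot"; the $server argument stands in for $_SERVER.

```php
<?php
// Hypothetical helper: true when the User-Agent contains "Googlebot"
// (case-insensitive), false otherwise or when the header is missing.
function is_googlebot(array $server): bool
{
    $userAgent = $server['HTTP_USER_AGENT'] ?? '';
    return stripos($userAgent, 'Googlebot') !== false;
}
```

The counter would then be started only when is_googlebot($_SERVER) returns false. The User-Agent header is supplied by the client, so this is a convenience filter and not a reliable way to identify Google.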