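I am trying to block bad bots via .htaccess with the code below. I know these are two separate methods, but neither seems to work: I still see these bots in the access log. What am I doing wrong?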
RewriteCond %{HTTP_USER_AGENT} ^BLEXBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SemrushBot [NC,OR]
SetEnvIfNoCase User-Agent "BLEXBot" rotbot
SetEnvIfNoCase User-Agent "SemrushBot" rotbot
<Limit POST GET HEAD PUT>
Order Allow,Deny
Allow from all
Deny from env=rotbot
</Limit>
The entries in the access log look like this:
domain.org:443 46.229.168.142 - - [22/Jul/2019:08:56:26 +0200] "GET /path/to/page/ HTTP/1.1" 403 3801 "-" "Mozilla/5.0 (compatible; SemrushBot/3~bl; +http://www.semrush.com/bot.html)"
domain.org:443 94.130.219.232 - - [22/Jul/2019:08:56:24 +0200] "GET /path/to/page/ HTTP/1.1" 403 760 "-" "Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)"
Answer 0 (score: 0)
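Change the rewrite rules to the following, which turns the rewrite engine on, drops the dangling [OR] from the last condition, and adds the RewriteRule that actually returns the 403: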
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BLEXBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SemrushBot [NC]
RewriteRule ^.* - [F,L]
</IfModule>
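
One thing worth checking against the log lines in the question: both user agents begin with "Mozilla/5.0 (compatible; ", and the bot name only appears later in the string. A pattern anchored with ^ matches only at the very start of the user agent, so ^BLEXBot and ^SemrushBot would not match those requests. A minimal sketch without the anchor, assuming mod_rewrite is enabled and .htaccess overrides are allowed for this directory:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# No leading ^ here: in the logged user agents the bot names appear
# mid-string, after "Mozilla/5.0 (compatible; ".
RewriteCond %{HTTP_USER_AGENT} (BLEXBot|SemrushBot) [NC]
# Match every request path and answer 403 Forbidden ([F] already ends processing).
RewriteRule ^ - [F]
</IfModule>
```

Note also that both quoted log entries already show status 403, which suggests the SetEnvIfNoCase block is refusing these requests. Apache logs denied requests too, so the bots showing up in the access log does not by itself mean the block failed.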