E107 redirect: Facebook scrape error

Time: 2012-05-25 12:21:24

Tags: facebook .htaccess redirect scraper

This is my .htaccess file:

RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} !^/?(usersettings\.php|page\.php|news\.php|signup\.php|admin/|plugins/forum/|plugins/.*/.*config\.php)
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?lf1medsoc\.com [NC]
RewriteRule .* - [F,L]

# 2. Redirect all access to the following user agents and files
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.76\ \[ru\]\ \(X11;\ U;\ SunOS\ 5\.7\ sun4u\) [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5.0$ [OR]
RewriteCond %{HTTP_USER_AGENT} (Bot\ Search|kangen|CaSpEr|MaMa|crew|plaNETWORK|dex|perl\ post$) [NC,OR]
RewriteCond %{REQUEST_URI} (contact\.php|help_us\.php|forum_index\.php|crossdomain\.xml|\.htaccess)
RewriteRule .* http://%{REMOTE_ADDR}/ [R,L]

# 3. Deny access to requests with contact.php or help_us.php in the query
# string, UNLESS those are referred from our own site (e.g. search)
RewriteCond %{QUERY_STRING} (contact\.php|request\.php\help_us\.php|casper)
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?lf1medsoc\.com [NC]
RewriteRule .* - [F,L]

# 4. Redirect empty user agent, UNLESS it's accessing the RSS feed
RewriteCond %{HTTP_USER_AGENT} ^$ 
RewriteCond %{REQUEST_URI} !^/?e107_plugins/rss_menu/rss.php
RewriteRule .* http://%{REMOTE_ADDR}/ [R,L]

# 5. Deny access to these files UNLESS referred from our site.
RewriteCond %{REQUEST_URI} ^/?(top|download|user|search|submitnews|fpw)\.php
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?lf1medsoc\.com [NC]
RewriteRule .* - [F]

Facebook scrape result for http://www.lf1medsoc.com/page.php?19 (publicly accessible, no login required, etc.):

WHOLE result page

  

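Scrape Information

Response Code: 200
Fetched URL: http://www.lf1medsoc.com/page.php?19
Canonical URL: http://www.lf1medsoc.com/
Final URL: http://www.lf1medsoc.com/page.php?2

Errors That Must Be Fixed

Circular Redirect Path: Circular redirect path detected (see the 'Redirect Path' section for details).

Redirect Path

Original: http://www.lf1medsoc.com/page.php?19
og:url: http://www.lf1medsoc.com/
302: http://www.lf1medsoc.com/page.php?2
og:url: http://www.lf1medsoc.com/

The final URL is in bold (this is the URL we tried to extract metadata from). URLs that are part of a circular redirect path are highlighted.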

Am I missing something in the .htaccess? Is there a way to add the Facebook user agent so that it is allowed in?

Note: page.php?2 is the home page (the redirect goes lf1medsoc.com -> index.php -> page.php?2)

1 Answer:

Answer 0 (score: 1)

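The og:url tag on http://www.lf1medsoc.com/page.php?2 points to http://www.lf1medsoc.com/, and that URL redirects (302) straight back to page.php?2, so the scraper is sent round in a circle.

Change it to http://www.lf1medsoc.com/page.php?2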
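
To see why that one change breaks the loop, here is a minimal Python sketch of the redirect path from the scrape report. It is a toy model, not Facebook's actual scraper logic: the `build_site` and `follow` helpers and the site map are hypothetical, and the only facts taken from the question are the three URLs, the 302 from the site root to page.php?2, and the og:url values shown in the report.

```python
# Toy model of the reported redirect path; NOT Facebook's real scraper.
ROOT = "http://www.lf1medsoc.com/"
HOME = "http://www.lf1medsoc.com/page.php?2"
PAGE = "http://www.lf1medsoc.com/page.php?19"


def build_site(home_og_url):
    """Hypothetical site map: URL -> ("redirect", target) or ("page", og_url)."""
    return {
        ROOT: ("redirect", HOME),     # the 302 shown in the redirect path
        HOME: ("page", home_og_url),  # og:url declared on the home page
        PAGE: ("page", ROOT),         # og:url declared on page.php?19
    }


def follow(site, start, max_hops=10):
    """Follow redirects and og:url hints; return (path, looped)."""
    path = [start]
    seen = {start}
    url = start
    for _ in range(max_hops):
        kind, target = site[url]
        if kind == "page" and target == url:
            return path, False        # page names itself as canonical: done
        path.append(target)
        if target in seen:
            return path, True         # revisiting a URL means a circular path
        seen.add(target)
        url = target
    return path, True                 # gave up after max_hops


# Current setup: the home page's og:url points at the site root.
print(follow(build_site(ROOT), PAGE))
# Proposed fix: the home page's og:url points at itself.
print(follow(build_site(HOME), PAGE))
```

In this model the current setup revisits http://www.lf1medsoc.com/ and is flagged as circular, matching the path in the report. With og:url on page.php?2 set to page.php?2, the walk stops there.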