Google AJAX crawling fails

Date: 2015-11-22 17:03:34

Tags: ajax .htaccess mod-rewrite url-rewriting google-webmaster-tools

I have a PHP AJAX website that serves pages to my users like this:

  http://www.example.com/ => home page; lists all the individual pages
  http://www.example.com/#!page1-uid => page1 contents; uid is the unique MongoDB identifier for that page
  http://www.example.com/#!page2-uid => page2 contents; uid is the unique MongoDB identifier for that page
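
For reference, the part after `#` is never sent to the server, so under the AJAX crawling scheme a crawler requests a rewritten URL in which the `#!` fragment is moved into an `_escaped_fragment_` query parameter. The following Python sketch illustrates that mapping. The function name `to_escaped_fragment_url` is hypothetical, and the escaping is a simplification (it percent-encodes more characters than the scheme strictly requires).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment_url(pretty_url: str) -> str:
    """Map a '#!' URL to the '_escaped_fragment_' URL a crawler would request."""
    parts = urlsplit(pretty_url)
    if not parts.fragment.startswith("!"):
        return pretty_url  # not a hash-bang URL, nothing to map
    # Percent-encode the fragment so it is safe as a query value; keep '=' readable.
    token = quote(parts.fragment[1:], safe="=")
    extra = "_escaped_fragment_=" + token
    # Append to an existing query string if there is one.
    query = parts.query + "&" + extra if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment_url("http://www.example.com/#!page1-uid"))
# prints http://www.example.com/?_escaped_fragment_=page1-uid
```

So the request that should reach the server for page1 is the `?_escaped_fragment_=page1-uid` form, not the `#!page1-uid` form.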

I want Google to crawl my site and index all 200+ pages, but none of them has been indexed.

I have followed Google's AJAX crawling scheme closely and believe I understand it, but I am not sure what I am still missing, or where.

Here is the setup:

.htaccess

  RewriteCond %{HTTP_USER_AGENT} (googlebot|yahoo|bingbot|baiduspider) [NC]
  RewriteCond %{QUERY_STRING} _escaped_fragment_=(.*)$
  RewriteRule ^(.*)$ botIndex.php?QSA=%1 [QSA,L]
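
To see what the second `RewriteCond` would capture into `%1` for a given query string, the pattern can be tried outside Apache. This is only a sketch: it uses Python's `re` module as a stand-in for Apache's regex engine (the two should agree on a pattern this simple), it ignores the User-Agent condition, and the helper name `captured_qsa` is hypothetical.

```python
import re

# Pattern copied from the second RewriteCond. mod_rewrite conditions are
# unanchored searches, so re.search is the closest equivalent.
ESCAPED_FRAGMENT = re.compile(r"_escaped_fragment_=(.*)$")


def captured_qsa(query_string: str):
    """Return what %1 would hold, or None if the condition does not match."""
    match = ESCAPED_FRAGMENT.search(query_string)
    return match.group(1) if match else None


print(repr(captured_qsa("_escaped_fragment_=page1-uid")))  # 'page1-uid'
print(repr(captured_qsa("_escaped_fragment_=")))           # ''
print(repr(captured_qsa("_escaped_fragement_")))           # None (misspelled, no '=')
print(repr(captured_qsa("")))                              # None
```

A request without an `_escaped_fragment_=` parameter, or with the parameter name misspelled, does not satisfy the condition, so the `RewriteRule` would not pass a `QSA` value for it.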

botIndex.php

  $var1 = isset($_REQUEST['QSA']) ? $_REQUEST['QSA'] : '';
  if ($var1 !== '') {
      // QSA is set: serve the individual page (page1/page2)
  } else {
      // QSA is not set: serve the default home page that lists all the page links
  }

When I test with GWT ("Fetch as Google"), this is the pattern I observe:

  a) www.example.com/ => it gets redirected to botIndex.php and returns all the links (default view), just as expected
  b) www.example.com/#!page1-uid => it gets redirected to botIndex.php and returns all the links, but ideally it should return the actual page content instead of the home page contents (I am not sure whether GWT can request _escaped_fragment_ to mimic Googlebot)
  c) www.example.com/?_escaped_fragement_ => GWT returns "Not found" error

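After adding a few echo statements to botIndex.php, I suspect that "_escaped_fragment_" is not captured for any of the requests above. As a result, botIndex.php never receives the QUERY_STRING (QSA) value it needs to serve the individual page1/page2 pages, and it falls back to the home page that lists all the pages.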
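
One way to check this without relying on GWT is to send the `_escaped_fragment_` request directly with a crawler-like User-Agent, so that both `RewriteCond` lines have a chance to match. The sketch below is an assumption about how such a check could look, not a verified fix. The helper names `build_bot_request` and `fetch_as_bot` are hypothetical, and the User-Agent is the commonly published Googlebot string.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Commonly published Googlebot User-Agent; any value matching the first
# RewriteCond (googlebot|yahoo|bingbot|baiduspider) would serve the same purpose.
BOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_bot_request(base_url: str, fragment: str) -> Request:
    """Build the request a crawler would send for base_url + '#!' + fragment."""
    url = base_url + "?_escaped_fragment_=" + quote(fragment, safe="=")
    return Request(url, headers={"User-Agent": BOT_UA})


def fetch_as_bot(base_url: str, fragment: str) -> str:
    """Send the request and return the body as text (performs network I/O)."""
    with urlopen(build_bot_request(base_url, fragment), timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


req = build_bot_request("http://www.example.com/", "page1-uid")
print(req.full_url)
# prints http://www.example.com/?_escaped_fragment_=page1-uid
```

Calling `fetch_as_bot("http://www.example.com/", "page1-uid")` against the real site and comparing the body with the home page listing would show whether the rewrite delivered `QSA` to botIndex.php.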

I also tested URLs directly against botIndex.php, such as:

  a) http://www.example.com/botIndex.php?_escaped_fragment_=QSA= (returns all the links )
  b) http://www.example.com/botIndex.php?_escaped_fragment_=QSA=page1-uid (returns the actual page details)

What else am I missing?

I strongly believe something is wrong in the .htaccess and that it fails to pass QSA to my script.

Please advise.

Update: I am still stuck. Can someone give me some pointers?

0 answers