Question

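I have added the spiderable package to my Meteor application. A request with ?_escaped_fragment_= in the URL returns the HTML version of the page, but I can't get Google to crawl the site.

Details

When I use Fetch as Google in Google Webmaster Tools to request the root page "http://example.com/", the page returned is the JavaScript version, something like: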

HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
date: Fri, 30 Nov 2012 05:39:36 GMT
connection: Keep-alive
transfer-encoding: chunked

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="/e83157bdc4ff057fa3a20b82af4c11b4ebe776e7.css">
    <script type="text/javascript">
      __meteor_runtime_config__ = {"ROOT_URL":"http://www.example.com","DEFAULT_DDP_ENDPOINT":"https://www-example-com-ddp.meteor.com/"};
    </script>
    <script type="text/javascript" src="/13cf3d21ce1c4a88407ca5f3c250f186ab1738f9.js"></script>
    <meta name="fragment" content="!">
    <title>example.com</title>
  </head>
<body>
</body>
</html>

If I instead request http://example.com/?_escaped_fragment_=, the HTML version is returned:

HTTP/1.1 200 OK
content-type: text/html; charset=UTF-8
date: Wed, 05 Dec 2012 02:44:09 GMT
connection: Keep-alive
transfer-encoding: chunked

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="/e83157bdc4ff057fa3a20b82af4c11b4ebe776e7.css">
    <title>example.com</title>
    <meta name="viewport" content="initial-scale=1.0">
  </head>
  <body>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/one">One</a></li>
      <li><a href="/two">Two</a></li>
    </ul>
  </body>
</html>

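Questions

How do you tell Google to append ?_escaped_fragment_= to the URL so that the HTML version is rendered?
Will Google still append ?_escaped_fragment_= to the URL if the URL does not contain a hashbang (#!)? That is, /home and /products/1 rather than /#!home and /#!products/1?
How do I get Google to follow the linked pages and append ?_escaped_fragment_= to them as well? Every JavaScript version of a page has <meta name="fragment" content="!"> in its head. I thought that was all that was needed.

The simplest solution seems to be updating the spiderable package so that it returns the HTML version to Googlebot directly instead of requiring ?_escaped_fragment_=, but I'm curious whether this works for other people as it is, and what I'm doing wrong.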

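Additional information

Meteor's spiderable package is a temporary solution that allows web search engines to index Meteor applications.

According to the source code, it does a few things: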

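It adds the following tag to the head section of the JavaScript version of the page:

<head><meta name="fragment" content="!"></head>

It uses PhantomJS to render the JavaScript application and returns the HTML version when either of the following conditions is met:

a. The requesting user agent is "facebookexternalhit"

b. The requested URL contains the string ?_escaped_fragment_=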
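
The two conditions above can be sketched as a small predicate. This is only an illustration of the decision as described in the question, not the spiderable package's actual code; the names shouldServeHtmlSnapshot and SNAPSHOT_USER_AGENTS are hypothetical.

```javascript
// Hypothetical sketch of the decision described above.
// Not taken from the spiderable package source.

// User agents that should always receive the rendered HTML version.
const SNAPSHOT_USER_AGENTS = [/facebookexternalhit/i];

// Returns true when the server should respond with the PhantomJS-rendered
// HTML version instead of the JavaScript version of the page.
function shouldServeHtmlSnapshot(requestUrl, userAgent) {
  // Condition b: the URL contains the string ?_escaped_fragment_=
  const wantsEscapedFragment = requestUrl.includes("?_escaped_fragment_=");
  // Condition a: the user agent matches one of the listed agents
  const isListedAgent = SNAPSHOT_USER_AGENTS.some((pattern) =>
    pattern.test(userAgent || "")
  );
  return wantsEscapedFragment || isListedAgent;
}
```

Under this sketch, the change suggested in the question (serving the HTML version to Googlebot without requiring ?_escaped_fragment_=) would amount to adding another pattern to the list.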

Answer 1

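I think this is a "Google Webmaster Tools" bug.

Google does appear to be crawling the site: the pages are showing up in Google search results. However, Google Webmaster Tools still lists the total number of indexed pages as 1. Bing, on the other hand, still has not crawled the pages.

EDIT: Google Webmaster Tools lists the pages as

Not selected: Pages that are not indexed because they are substantially similar to other pages, or that have been redirected to another URL.

EDIT2: In response to Jonatan's question:

Will Google still append ?_escaped_fragment_= to the URL if the URL does not contain a hashbang (#!)?

Yes. My application does not use hashbangs (#!) in its URLs. Googlebot still appends ?_escaped_fragment_= when crawling. Here is a sample from the logs: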

INFO HIT /url/2/01 66.249.72.42
INFO HIT /url/2/01?_escaped_fragment_= 66.249.72.142
INFO HIT /url/2/01 108.162.222.82
INFO HIT /url/2/01?_escaped_fragment_= 108.162.222.82
INFO HIT /url/2/05 108.162.222.82
INFO HIT /url/2/05?_escaped_fragment_= 108.162.222.214

Googlebot appears to try each URL both with and without ?_escaped_fragment_=.
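
To check the same thing on your own logs, a rough sketch like the following counts, per path, how many hits came with and without ?_escaped_fragment_=. It assumes the "INFO HIT <url> <ip>" line format shown in the sample; summarizeHits is a hypothetical helper name, and other log formats would need a different pattern.

```javascript
// Sample lines copied from the log excerpt in this answer.
const logLines = [
  "INFO HIT /url/2/01 66.249.72.42",
  "INFO HIT /url/2/01?_escaped_fragment_= 66.249.72.142",
  "INFO HIT /url/2/01 108.162.222.82",
  "INFO HIT /url/2/01?_escaped_fragment_= 108.162.222.82",
  "INFO HIT /url/2/05 108.162.222.82",
  "INFO HIT /url/2/05?_escaped_fragment_= 108.162.222.214",
];

// Hypothetical helper: groups hits by path and counts plain requests
// versus requests carrying the _escaped_fragment_ query parameter.
function summarizeHits(lines) {
  const summary = {};
  for (const line of lines) {
    // Assumed format: "INFO HIT <url> <ip>"
    const match = /^INFO HIT (\S+) (\S+)$/.exec(line.trim());
    if (!match) continue; // skip lines in any other format
    const [path, query = ""] = match[1].split("?");
    const entry = summary[path] || (summary[path] = { plain: 0, escaped: 0 });
    if (query.startsWith("_escaped_fragment_=")) {
      entry.escaped += 1;
    } else {
      entry.plain += 1;
    }
  }
  return summary;
}

console.log(summarizeHits(logLines));
```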

Answer 2

Any page that does not have a hash fragment beginning with #!, such as the home page, needs this:

 <meta name="fragment" content="!">

to tell the crawler to fetch the ugly URL (the one with _escaped_fragment_=). It goes in the <head> section.

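Update: I noticed that, according to the package description given at the end of the question, the meta tag above is already added for you. You can check whether it is included in your page by viewing the source.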
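
If you would rather check this programmatically than by viewing the source, a minimal sketch could look like the following. It takes the page HTML as a string (how you fetch it is up to you) and uses a simple regular-expression scan rather than a real HTML parser, so treat it as an approximation; hasFragmentMetaTag is a hypothetical helper name.

```javascript
// Hypothetical helper: returns true if the HTML contains
// <meta name="fragment" content="!"> with the attributes in any order.
// Uses a regex scan, not a full HTML parser, so unusual markup may be missed.
function hasFragmentMetaTag(html) {
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  return metaTags.some(
    (tag) =>
      /\bname\s*=\s*["']fragment["']/i.test(tag) &&
      /\bcontent\s*=\s*["']!["']/i.test(tag)
  );
}

// The JavaScript version from the question includes the tag;
// the HTML version returned for ?_escaped_fragment_= does not.
const jsVersionHead =
  '<head><meta name="fragment" content="!"><title>example.com</title></head>';
const htmlVersionHead =
  '<head><title>example.com</title><meta name="viewport" content="initial-scale=1.0"></head>';

console.log(hasFragmentMetaTag(jsVersionHead));
console.log(hasFragmentMetaTag(htmlVersionHead));
```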

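Normally, every page other than the home page would have a pretty URL of the form www.yoursite.com/#!hashfragment, where the ! after the hash (#) is what signals the crawler, so those pages don't need the meta tag mentioned above.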
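
The mapping from pretty URL to ugly URL described in this answer can be sketched as follows. This is a simplified illustration of the crawler's side of the scheme, not Google's implementation: toUglyUrl is a hypothetical helper, and encodeURIComponent is used as a stand-in that escapes more characters than the scheme strictly requires.

```javascript
// Hypothetical sketch of how a crawler derives the ugly URL.
// hasFragmentMetaTag says whether the page declares
// <meta name="fragment" content="!"> in its head.
function toUglyUrl(prettyUrl, hasFragmentMetaTag) {
  const hashbangIndex = prettyUrl.indexOf("#!");
  if (hashbangIndex !== -1) {
    // Pretty URL with a hashbang: move the fragment into the query string.
    const base = prettyUrl.slice(0, hashbangIndex);
    const fragment = prettyUrl.slice(hashbangIndex + 2);
    const separator = base.includes("?") ? "&" : "?";
    // Assumption: encodeURIComponent approximates the scheme's escaping rules.
    return base + separator + "_escaped_fragment_=" + encodeURIComponent(fragment);
  }
  if (hasFragmentMetaTag) {
    // No hashbang, but the meta tag opts the page in: append an empty value.
    const separator = prettyUrl.includes("?") ? "&" : "?";
    return prettyUrl + separator + "_escaped_fragment_=";
  }
  // Neither signal is present, so the crawler requests the URL unchanged.
  return prettyUrl;
}

console.log(toUglyUrl("http://www.yoursite.com/#!hashfragment", false));
console.log(toUglyUrl("http://example.com/", true));
```

The second call corresponds to the situation in the question: URLs without hashbangs rely entirely on the meta tag to be requested with ?_escaped_fragment_=.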

Answer 3

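I know this question has already been answered, but for anyone who arrives here from Google, I'd like to add this screencast to the thread.

It helped me understand Meteor's spiderable package. https://www.eventedmind.com/tracks/feed-archive/meteor-the-spiderable-package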
