Getting Google Sitemap Generator to work: "[ERROR] When attempting to access our generated Sitemap ... we failed to read it."

Date: 2013-04-09 15:10:08

Tags: seo sitemap google-sitemap

I'm trying to get Google Sitemap Generator to work.

This is my (Zend Framework 2) project structure:

/
/...
/public/...
/public/sitemap.xml
/public/urllist.txt
/...
/temp/googlesitemapgen/
/temp/googlesitemapgen/config.xml
/temp/googlesitemapgen/sitemap_gen.py
/...

In config.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<site
    base_url="http://foo.bar.loc"
    store_into="/var/www/bar/foo/public/sitemap.xml"
    verbose="3"
    suppress_search_engine_notify="0"
>
    <urllist path="/var/www/bar/foo/public/urllist.txt" encoding="UTF-8" />
</site>

In urllist.txt:

http://foo.bar.loc

When I run the generator script

user@machine:/var/www/bar/foo/temp/googlesitemapgen# python sitemap_gen.py --config=config.xml

an error occurs:

user@machine:/var/www/bar/foo/temp/googlesitemapgen# python sitemap_gen.py --config=config.xml 
sitemap_gen.py:65: DeprecationWarning: the md5 module is deprecated; use hashlib instead
  import md5
Reading configuration file: config.xml
BaseURL is set to: http://foo.bar.loc/
Input: From URLLIST "/var/www/bar/foo/public/urllist.txt"
Opened URLLIST file: /var/www/bar/foo/public/urllist.txt
[WARNING] Discarded URL for not starting with the base_url: http://foo.bar.loc
[WARNING] No URLs were recorded, writing an empty sitemap.
Sorting and normalizing collected URLs.
Writing Sitemap file "/var/www/bar/foo/public/sitemap.xml" with 0 URLs
Notifying search engines.
[ERROR] When attempting to access our generated Sitemap at the following URL:
    http://foo.bar.loc/sitemap.xml
  we failed to read it.  Please verify the store_into path you specified in
  your configuration file is web-accessable.  Consult the FAQ for more
  information.
[WARNING] Proceeding to notify with an unverifyable URL.
Notifying: www.google.com
Notification URL: http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A%2F%2Ffoo.bar.loc%2Fsitemap.xml
Number of errors: 1
Number of warnings: 3

This error is described in the "Troubleshooting" section of the documentation. But I have already checked base_url and store_into, and both are set correctly.

Why does this error occur? Am I doing something wrong, and if so, what? How can I get the tool to work?

THX

1 answer:

Answer 0 (score: 0)

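You need a urllist.txt with actual URLs in it. The sitemap generator will not crawl or spider your site for you. It can examine Apache logs or reference other generated sitemaps, but it does not do any crawling on its own.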
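
The log in the question hints at a related detail: it reports the base URL as http://foo.bar.loc/ (with a trailing slash) and then discards http://foo.bar.loc for not starting with it. Assuming the check is a plain string-prefix comparison, which the warning text suggests but which is not confirmed here, a small Python sketch can show which urllist.txt entries would survive. The function names below are hypothetical and are not part of sitemap_gen.py.

```python
def normalize_base_url(base_url):
    """Append a trailing slash if missing, mirroring the 'BaseURL is set to' log line."""
    return base_url if base_url.endswith("/") else base_url + "/"


def split_urls(urls, base_url):
    """Split URL strings into (kept, discarded) lists.

    Assumption: an entry is kept only if it starts with the normalized
    base URL. Blank entries are skipped.
    """
    base = normalize_base_url(base_url)
    kept, discarded = [], []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue
        if url.startswith(base):
            kept.append(url)
        else:
            discarded.append(url)
    return kept, discarded
```

Under that assumption, calling split_urls(["http://foo.bar.loc"], "http://foo.bar.loc") puts the single entry in the discarded list, which matches the "No URLs were recorded" warning, whereas http://foo.bar.loc/ with a trailing slash would be kept.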

See my answer here:

https://webmasters.stackexchange.com/questions/47085/is-there-an-xml-sitemap-generator-with-command-line-interface-for-nginx-on-linux/47105#47105

There I give a command string that generates a list of URLs for a given website by crawling it.
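
For readers who prefer to stay in Python, the same idea can be sketched with the standard library. This is not the command from the linked answer; it is a minimal illustration, and the names LinkCollector, same_site_links, and crawl are hypothetical. The fetch argument is assumed to be any callable that returns the HTML of a URL as a string, so the crawl logic can be tried without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, page_url, base_url):
    """Return sorted absolute links found in html that start with base_url."""
    parser = LinkCollector()
    parser.feed(html)
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if absolute.startswith(base_url):
            found.add(absolute)
    return sorted(found)


def crawl(base_url, fetch, limit=100):
    """Breadth-first crawl from base_url, visiting at most `limit` URLs."""
    seen, queue = set(), [base_url]
    while queue and len(seen) < limit:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        for link in same_site_links(fetch(url), url, base_url):
            if link not in seen:
                queue.append(link)
    return sorted(seen)
```

The returned list could then be written to urllist.txt one URL per line. A real fetch function would also need error handling, politeness delays, and content-type checks, none of which are shown here.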