Example from "Programming Collective Intelligence" does not work

Time: 2013-03-31 13:51:08

Tags: python search search-engine ranking

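I downloaded the source code from here. I am trying to run the example from Chapter 4 of Toby Segaran's book "Programming Collective Intelligence". My Python version is 2.7.2. I typed this code into the interpreter: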

import searchengine
pages=['http://en.wikipedia.org/wiki/Programming_language']
crawler = searchengine.crawler('searchindex.db')
crawler.crawl(pages)

I get the message:

Could not open http://en.wikipedia.org/wiki/Programming_language

Or sometimes I get the message:

Indexing http://en.wikipedia.org/wiki/Programming_language
Could not parse page http://en.wikipedia.org/wiki/Programming_language

In short, the crawler does not index the page. What am I doing wrong?

1 answer:

Answer 0: (score: 1)

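In def separateWords(self,text), change the uppercase W to lowercase so the method is named separatewords, and in gettextonly(self,soup), change v==Null to v==None. You also have to run the following steps: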

>>> crawler=searchengine.crawler('searchindex.db')
>>> crawler.createindextables()
>>> crawler=searchengine.crawler('searchindex.db')

Do this first, and then try running page=['***'] and the remaining steps.
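To make the two code fixes concrete, here is a minimal standalone sketch of what the corrected methods might look like. It is not the book's exact code: the methods are written as plain functions, `Node` is a hypothetical stand-in for a BeautifulSoup node (only `.string` and `.contents` are modeled), and the splitter uses `\W+`, which is an assumption chosen so the sketch behaves the same on newer Python versions.

```python
import re


class Node(object):
    """Hypothetical stand-in for a BeautifulSoup node (illustration only)."""

    def __init__(self, string=None, contents=()):
        self.string = string            # text of a leaf node, else None
        self.contents = list(contents)  # child nodes


def gettextonly(node):
    # Compare against None. The name Null does not exist in Python,
    # so `v == Null` raises NameError as soon as this line runs.
    v = node.string
    if v is None:
        # No direct text: collect the text of every child, one per line.
        return ''.join(gettextonly(child) + '\n' for child in node.contents)
    return v.strip()


def separatewords(text):
    # All-lowercase name: a caller that looks up `separatewords` will not
    # find a method defined as `separateWords`.
    splitter = re.compile(r'\W+')  # split on runs of non-word characters
    return [s.lower() for s in splitter.split(text) if s != '']


# Usage: a tiny fake "page" with one leaf and one nested element.
page = Node(contents=[Node(string=' Hello '),
                      Node(contents=[Node(string='World!')])])
print(separatewords(gettextonly(page)))
```

If the crawl loop wraps indexing in a bare `except`, as the "Could not parse page" message suggests, either mistake would surface only as that generic message rather than as the underlying `NameError` or `AttributeError`.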