I'm having trouble creating a Scrapy spider while following the Scrapy tutorial:
http://doc.scrapy.org/en/latest/intro/tutorial.html#our-first-spider
Here is what I have in my spiders/dmoz_spider.py file:
class DmozSpider(object):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    @classmethod
    def from_crawler(cls, crawler):
        spider = crawler.spiders
        return cls(spider)

    def parse(self, response):
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)
The good news is that I'm pretty sure the spider is being created. The bad news is that I get this error:
(scrapestat)unknownc8e0eb148153:tutorial christopherspears$ scrapy crawl dmoz
Traceback (most recent call last):
File "/Users/christopherspears/.virtualenvs/scrapestat/bin/scrapy", line 4, in <module>
execute()
File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 143, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
func(*a, **kw)
File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command
cmd.run(args, opts)
File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/commands/crawl.py", line 48, in run
spider = crawler.spiders.create(spname, **opts.spargs)
File "/Users/christopherspears/.virtualenvs/scrapestat/lib/python2.7/site-packages/scrapy/spidermanager.py", line 44, in create
raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: dmoz'
I'm not sure what the problem is. Any tips?
Answer 0 (score: 1)
DmozSpider should inherit from BaseSpider (or Spider, depending on your Scrapy version). So make the following change in your code:
from scrapy.spider import BaseSpider

class DmozSpider(BaseSpider):
    ...
I tried it myself: when the spider class inherits from object, the same KeyError is raised.
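For reference, here is a minimal sketch of the full spider with that change applied. It assumes an older Scrapy release where BaseSpider is imported from scrapy.spider (on newer versions, inherit from scrapy.Spider instead). The custom from_crawler override from the question isn't needed for the spider to be discovered, so it is omitted here:

from scrapy.spider import BaseSpider

class DmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # Save each response body to a file named after the second-to-last URL segment
        filename = response.url.split("/")[-2]
        with open(filename, 'wb') as f:
            f.write(response.body)

After that change, scrapy crawl dmoz should find the spider; you can also run scrapy list from the project directory to confirm that Scrapy sees it.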