Can't run Scrapy program

Time: 2015-03-26 10:18:14

Tags: python module web-crawler scrapy

I am learning how to use Scrapy from the following link:

http://doc.scrapy.org/en/master/intro/tutorial.html

When I try to run the code written in the Crawling section (scrapy crawl dmoz), I get the following error:

AttributeError: 'module' object has no attribute 'Spider'

However, when I changed "Spider" to "spider", I just got a new error:

TypeError: Error when calling the metaclass bases
module.__init__() takes at most 2 arguments (3 given)

I am confused: what is the problem? Any help would be greatly appreciated. Thank you. By the way, I am using Windows.
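For context, in the 0.16-era releases scrapy.spider is a module (it contains BaseSpider rather than being a class itself), so lowercasing Spider makes the class inherit from a module object, which in Python 2 produces exactly the second error. A minimal sketch under that assumption, using a stand-in module instead of Scrapy:

import types

# Stand-in for the scrapy.spider *module*; inheriting from any module object
# in Python 2 raises the same metaclass TypeError quoted above.
spider_module = types.ModuleType("spider")

class DmozSpider(spider_module):   # TypeError: Error when calling the metaclass bases
    pass                           # module.__init__() takes at most 2 arguments (3 given)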

Edit (source added):

First, I created a Scrapy project by going to the directory and running the following commands in cmd:

cd #DIRECTORY PATH#

scrapy startproject tutorial

This creates a folder named tutorial in the given directory. The tutorial folder contains:

tutorial/
    scrapy.cfg
    tutorial/
        __init__.py
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            ...

Then I defined my items:

import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()
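As a quick sketch (assuming a newer Scrapy release where Item and Field are exposed at the top level of the package), the item can be exercised on its own; the values below are placeholders for illustration:

import scrapy

class DmozItem(scrapy.Item):   # same definition as above
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()

# Items behave like dictionaries; field values are set and read by key.
item = DmozItem(title="Example book", link="http://www.dmoz.org/")
item["desc"] = "A placeholder description"
print(item["title"])
print(item["link"])
print(item["desc"])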

After that, I created the spider:

import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        with open(filename, 'wb') as f:
            f.write(response.body)

After that, the error appears when I run the code. I am using Windows 7 64-bit with Python 2.7 32-bit.
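Since the top-level Spider attribute only exists in later Scrapy releases, a quick sketch to see which version the command is actually importing (run it with the same Python that cmd uses):

import scrapy

# Scrapy 0.16.x has no top-level Spider attribute; 0.24+ does.
print(scrapy.__version__)
print(hasattr(scrapy, "Spider"))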

Edit 2:

I tried uninstalling Scrapy and installing a different version, but it didn't help. Here is the log:

C:\Users\Novin Pendar\Desktop\FS\tutorial>scrapy crawl dmoz
2015-03-26 17:48:29+0430 [scrapy] INFO: Scrapy 0.16.5 started (bot: tutorial)
2015-03-26 17:48:29+0430 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetCon
sole, CloseSpider, WebService, CoreStats, SpiderState
C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\__init__.pyc
Traceback (most recent call last):
  File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\cmdline.py"
, line 156, in <module>
    execute()
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\cmdline.py"
, line 131, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\cmdline.py"
, line 76, in _run_print_help
    func(*a, **kw)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\cmdline.py"
, line 138, in _run_command
    cmd.run(args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\commands\cr
awl.py", line 43, in run
    spider = self.crawler.spiders.create(spname, **opts.spargs)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\command.py"
, line 33, in crawler
    self._crawler.configure()
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\crawler.py"
, line 40, in configure
    self.spiders = spman_cls.from_crawler(self)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\spidermanag
er.py", line 35, in from_crawler
    sm = cls.from_settings(crawler.settings)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\spidermanag
er.py", line 31, in from_settings
    return cls(settings.getlist('SPIDER_MODULES'))
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\spidermanag
er.py", line 22, in __init__
    for module in walk_modules(name):
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\utils\misc.
py", line 65, in walk_modules
    submod = __import__(fullpath, {}, {}, [''])
  File "tutorial\spiders\dmoz_spider.py", line 3, in <module>
    class DmozSpider(scrapy.Spider):
AttributeError: 'module' object has no attribute 'Spider'

Edit 3:

Problem solved. I downloaded the latest version of Scrapy (0.24), installed it, and everything works. I just want to mention this for anyone who runs into the same problem, so they can save a lot of time. Thanks.
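For anyone doing the same, a small sketch to confirm the new version is the one being picked up after reinstalling (the upgrade itself can be done with, e.g., pip install --upgrade scrapy):

import scrapy

print(scrapy.__version__)   # expected 0.24.x or newer after the upgrade
print(scrapy.Spider)        # resolves without AttributeError once it took effect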

2 Answers:

Answer 0 (score: 1):

If your installation is correct, try this:

Check your working folder for any scrapy.py or scrapy.pyc files. If they exist, rename them. Do not change Spider to spider.
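A quick sketch to confirm whether such a file is shadowing the real package is to check where Python imports scrapy from:

import scrapy

# If this prints a path inside your project folder (e.g. a local scrapy.py or
# scrapy.pyc) rather than ...\site-packages\scrapy\..., that local file is
# shadowing the installed package and attributes like Spider will be missing.
print(scrapy.__file__)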

Answer 1 (score: 0):

Use this definition: class DmozSpider(scrapy.spider.BaseSpider):
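For reference, a sketch of the same spider written against the 0.16-era API that this answer refers to (in those releases BaseSpider lives in the scrapy.spider module):

from scrapy.spider import BaseSpider

class DmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # Same behaviour as the spider in the question: save each page body
        # to a file named after the last URL path segment.
        filename = response.url.split("/")[-2]
        with open(filename, "wb") as f:
            f.write(response.body)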