Scrapy always errors out at the command prompt

Posted: 2017-10-30 09:14:50

Tags: scrapy scrapy-spider

I am trying to learn Scrapy on Bash on Ubuntu on Windows 10. I created one spider (yelprest) with the genspider command, and then created another spider (quotes_spider) directly by writing the spider file myself, following the official tutorial.

The first spider has not been tested yet. I worked through the tutorial with the second spider, and when I tried to run it, I got an error pointing at the first spider. Also, when I run any other scrapy command, such as version, I get the same error. Here is the error:

(BashEnv) root > scrapy version
Traceback (most recent call last):
  File "/mnt/s/BashEnv/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 148, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 243, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 134, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/mnt/s/BashEnv/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/mnt/s/BashEnv/Scrapy/Scrapy/spiders/yelprest.py", line 14
    rules = (
    ^
IndentationError: unexpected indent
(BashEnv) root >

I don't understand why I get the same error for every command I run.

1 Answer:

Answer 0 (score: 1)

There is an error in your yelprest.py file (at or before line 14): it is not valid Python. Fix that error and everything will work. Make sure your file is correctly indented and does not mix spaces and tabs.
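To illustrate (a hypothetical reconstruction, not the actual contents of yelprest.py): a single over-indented line inside an otherwise valid class body is enough to make the whole module fail to compile, which is exactly what Scrapy hits when it imports the file.

```python
# Hypothetical sketch of the kind of mistake that triggers
# "IndentationError: unexpected indent" -- NOT the real yelprest.py.
bad_source = (
    'class YelprestSpider:\n'
    '    name = "yelprest"\n'
    '\n'
    '        rules = ()\n'  # over-indented relative to the class body
)

try:
    # compile() fails at parse time, just like "import yelprest" does
    compile(bad_source, "yelprest.py", "exec")
except IndentationError as err:
    print("%s: %s (line %s)" % (type(err).__name__, err.msg, err.lineno))
```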

Edit

To make sure the error is in this file, delete it. If everything works without this file, the error must be there!
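A gentler check than deleting the file is to compile it directly, without involving Scrapy at all. The snippet below demonstrates this on a throwaway file with a deliberately broken indent; the same `py_compile` command run on the real `Scrapy/Scrapy/spiders/yelprest.py` would surface the identical IndentationError:

```shell
# Create a throwaway file with the same kind of over-indented line:
printf 'class Demo:\n    name = "demo"\n\n        rules = ()\n' > /tmp/bad_spider.py

# py_compile reports the IndentationError directly, no Scrapy needed.
# (Run the same command on your actual spider file to check it.)
python3 -m py_compile /tmp/bad_spider.py || echo "not valid Python"
```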

Update

Your question does not state it clearly, but judging by your comment, your real question is "why does Scrapy load my spider code for every command?". The answer is: because Scrapy is designed that way. Some commands can only be run inside a project, like check or crawl. Some commands can be run anywhere, like startproject. But inside a Scrapy project, any command will load all of your code. Scrapy is made that way.
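A rough sketch of why this happens: on startup, Scrapy imports every module under the packages listed in the SPIDER_MODULES setting (the walk_modules frames in the traceback above). The function below is illustrative, not Scrapy's actual code; it demonstrates the walk-and-import behaviour on a stdlib package.

```python
import importlib
import pkgutil

def walk_and_import(package_name):
    """Import a package and every submodule under it -- roughly what
    Scrapy does for each SPIDER_MODULES entry at startup.
    A SyntaxError or IndentationError in ANY submodule propagates up,
    which is why every command failed, not just the broken spider.
    """
    package = importlib.import_module(package_name)
    modules = [package]
    for _, name, _ in pkgutil.walk_packages(package.__path__, package_name + "."):
        modules.append(importlib.import_module(name))
    return modules

# Demonstrate on a stdlib package instead of a Scrapy project:
print([m.__name__ for m in walk_and_import("json")])
```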

For example, I have a project called crawler (very descriptive, I know!):

$ cd ~
$ scrapy version
Scrapy 1.4.0
$ cd crawler/
$ scrapy version
2017-10-31 14:47:42 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: crawler)
2017-10-31 14:47:42 [scrapy.utils.log] INFO: Overridden settings: {...}
Scrapy 1.4.0