All,
I updated the default system packages and installed Scrapy, an open-source framework for building spiders, from scrapy.org by following these steps: http://doc.scrapy.org/en/1.1/intro/install.html
Commands:
xcode-select --install
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
echo "export PATH=/usr/local/bin:/usr/local/sbin:$PATH" >> ~/.bashrc
source ~/.bashrc
brew install python
brew update; brew upgrade python
pip install Scrapy
I hope the commands above make it clear how I tried to update and install the packages. I then followed the instructions to create a project, define the items, and write my first spider. When I finally ran the command scrapy crawl dmoz, I got the error message below.
Crawl command:
Romans-MBP:tutorial Roman$ scrapy crawl dmoz
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 141, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 238, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 129, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/usr/local/lib/python2.7/site-packages/scrapy/crawler.py", line 325, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/usr/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 33, in from_settings
    return cls(settings)
  File "/usr/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 20, in __init__
    self._load_all_spiders()
  File "/usr/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 28, in _load_all_spiders
    for module in walk_modules(name):
  File "/usr/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 63, in walk_modules
    mod = import_module(path)
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named spiders
Romans-MBP:tutorial Roman$
Answer 0 (score: 0)
Check the name of your spider in scrapy/tutorial/tutorial/spiders/[your_spider].py;
that is the name that must be passed to the scrapy crawl command.
In the example below, the name is dmozdirectory,
so the run command is scrapy crawl dmozdirectory.
Example:
import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmozdirectory"
    allowed_domains = ["dmoz.org"]
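To illustrate why the spider is looked up by its name attribute rather than its class name, here is a minimal sketch of that matching logic. This is not Scrapy's actual code; the Spider base class and find_spider helper are hypothetical stand-ins:

```python
class Spider:
    """Hypothetical stand-in for scrapy.Spider."""
    pass

class DmozSpider(Spider):
    # `scrapy crawl dmozdirectory` matches this attribute,
    # not the class name DmozSpider.
    name = "dmozdirectory"
    allowed_domains = ["dmoz.org"]

def find_spider(crawl_name, spider_classes):
    """Return the spider class whose `name` matches the crawl argument."""
    for cls in spider_classes:
        if getattr(cls, "name", None) == crawl_name:
            return cls
    raise KeyError("Spider not found: " + crawl_name)
```

So find_spider("dmozdirectory", [DmozSpider]) succeeds, while find_spider("dmoz", [DmozSpider]) raises an error, mirroring why scrapy crawl dmoz fails when the spider's name is dmozdirectory.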
Also, you should be in the root directory of your project when running that command, i.e. in scrapy/tutorial/.
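The ImportError: No module named spiders in the traceback can also appear when the project layout is incomplete, for example when tutorial/spiders/__init__.py is missing, since Python then cannot import the tutorial.spiders package. A rough sketch of a layout check; the tutorial paths are assumptions based on the tutorial's default project name:

```python
import os

# Files a minimal Scrapy tutorial project needs for "tutorial.spiders"
# to be importable (paths assume the tutorial's default project name).
REQUIRED = (
    "scrapy.cfg",
    os.path.join("tutorial", "__init__.py"),
    os.path.join("tutorial", "spiders", "__init__.py"),
)

def check_project(root):
    """Return a list of required files missing under `root`."""
    return ["missing " + path
            for path in REQUIRED
            if not os.path.exists(os.path.join(root, path))]
```

Running this from the directory that contains scrapy.cfg and getting an empty list rules out the missing-package-file cause.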