I'm trying to do a simple deploy of a Scrapy spider to Scrapinghub using the rules they provide. For some reason, it specifically searches a Python 3.6 directory, when it should be able to work with any Python 3.x. My spider is written in Python 3.5, which is a problem. Scrapinghub says that specifying "scrapy:1.4-py3" will work for the Python 3.x set, but that's apparently not true.
Also, for some reason it can't seem to find my spider in the project. This may be related to the 3.6 directory issue.
Finally, I have installed everything required in the requirements file.
C:\Users\Desktop\Empery Code\YahooScrape>shub deploy
Packing version 1.0
Deploying to Scrapy Cloud project "205357"
Deploy log last 30 lines:
Deploy log location: C:\Users\AppData\Local\Temp\shub_deploy_of5_m4qg.log
Error: Deploy failed: b'{"status": "error", "message": "Internal build error"}'
    _run(args, settings)
  File "/usr/local/lib/python3.6/site-packages/sh_scrapy/crawl.py", line 103, in _run
    _run_scrapy(args, settings)
  File "/usr/local/lib/python3.6/site-packages/sh_scrapy/crawl.py", line 111, in _run_scrapy
    execute(settings=settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/cmdline.py", line 148, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/crawler.py", line 243, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/crawler.py", line 134, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/crawler.py", line 330, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/usr/local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/usr/local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/usr/local/lib/python3.6/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/usr/local/lib/python3.6/site-packages/scrapy/utils/misc.py", line 63, in walk_modules
    mod = import_module(path)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 948, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'YahooScrape.spiders'
{"message": "list-spiders exit code: 1", "details": null, "error": "build_error"}
{"status": "error", "message": "Internal build error"}
C:\Users\Desktop\Empery Code\YahooScrape>
scrapy.cfg file:
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.org/en/latest/deploy.html
[settings]
default = YahooScrape.settings
[deploy]
#url = http://localhost:6800/
project = YahooScrape
scrapinghub.yml code:
project: -----
requirements:
  file: requirements.txt
stacks:
  default: scrapy:1.4-py3
Answer 0 (score: 2)
Make sure your directory tree looks like this:
$ tree
.
├── YahooScrape
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── yahoo.py
│       └── __init__.py
├── requirements.txt
├── scrapinghub.yml
├── scrapy.cfg
└── setup.py
Pay special attention to YahooScrape/spiders/. It should contain an __init__.py file (an empty file is fine) along with your different spiders, usually as individual .py files.
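If the marker file is missing, it can be created by hand or with a short script along these lines (a sketch; the "YahooScrape/spiders" path assumes the layout in the tree above, so adjust it to your project):

```python
from pathlib import Path

# Create an empty __init__.py so Python treats the directory as a package.
# Path assumes the project layout shown in the tree above.
spiders = Path("YahooScrape") / "spiders"
spiders.mkdir(parents=True, exist_ok=True)   # no-op if the directory exists
(spiders / "__init__.py").touch(exist_ok=True)
```

Run it from the project root (the directory containing scrapy.cfg), then redeploy.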
Otherwise YahooScrape.spiders cannot be interpreted as a Python module, hence the "ModuleNotFoundError: No module named 'YahooScrape.spiders'" message.
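One way to see why the missing file matters: shub builds the project into an egg with setuptools, and the classic find_packages() skips any directory that lacks an __init__.py, so the spiders subpackage simply never ships. A minimal, self-contained sketch of that behavior (it uses a throwaway temp directory, not your real project):

```python
import tempfile
from pathlib import Path
from setuptools import find_packages

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "YahooScrape" / "spiders").mkdir(parents=True)
    (root / "YahooScrape" / "__init__.py").touch()

    # Without spiders/__init__.py the subpackage is invisible to setuptools.
    before = find_packages(where=tmp)

    (root / "YahooScrape" / "spiders" / "__init__.py").touch()
    after = find_packages(where=tmp)

print(sorted(before))  # ['YahooScrape']
print(sorted(after))   # ['YahooScrape', 'YahooScrape.spiders']
```

The same package discovery runs during deploy, which is why the error surfaces remotely even though the spider runs locally from the source tree.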