scrapyd-deploy to Scrapyd does not install the requirements listed in setup.py

Time: 2017-09-24 12:09:17

Tags: python scrapy setuptools scrapy-spider scrapyd

I have a project written with Scrapy. The spider has a lot of requirements listed in setup.py. Here is a simple example. I run

scrapyd-deploy

and get the following output

Packing version 1506254163
Deploying to project "quotesbot" in http://localhost:6800/addversion.json
Server response (200):
......................... [CUTTED TRACEBACK] ...........
\"/private/var/folders/xp/c949vlsd14q8xm__dv0dx8jh0000gn/T/quotesbot-1506254163-e50lmcfx.egg/quotesbot/spiders/toscrape-css.py\",
 line 4, in <module>\n
ModuleNotFoundError: No module named 'sqlalchemy'\n"}

BUT

setup.py in the same directory:

# Automatically created by: scrapyd-deploy

from setuptools import setup, find_packages

setup(
    name         = 'quotesbot',
    version      = '1.0',
    packages     = find_packages(),
    entry_points = {'scrapy': ['settings = quotesbot.settings']},
    install_requires=[
        'scrapy-splash',
         [ SOME REQUIREMENTS]
        'sqlalchemy'
    ],
)

1 answer:

Answer 0 (score: 3)

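I checked the scrapyd source code, and it does not run your project's setup.py. It just unpacks the egg, which contains the dependency information but not the dependencies themselves. Below is the code for the addversion API: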

class AddVersion(WsResource):

    def render_POST(self, txrequest):
        project = txrequest.args[b'project'][0].decode('utf-8')
        version = txrequest.args[b'version'][0].decode('utf-8')
        eggf = BytesIO(txrequest.args[b'egg'][0])
        self.root.eggstorage.put(eggf, project, version)
        spiders = get_spider_list(project, version=version)
        self.root.update_projects()
        UtilsCache.invalid_cache(project)
        return {"node_name": self.root.nodename, "status": "ok", "project": project, "version": version, \
            "spiders": len(spiders)}

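After self.root.eggstorage.put(eggf, project, version), which basically just extracts the egg, it directly runs spiders = get_spider_list(project, version=version), so no setup step ever happens and install_requires is never acted on.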
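To see what "dependency information but not the dependencies" means in practice, here is a minimal sketch that lists the requirements recorded inside an egg. It assumes the egg was built by setuptools, which writes them to EGG-INFO/requires.txt; the helper name read_egg_requirements is hypothetical and is not part of scrapyd.

```python
import zipfile


def read_egg_requirements(egg_file):
    """Return the requirement lines recorded in an egg's metadata.

    egg_file may be a path or a binary file-like object. An egg is a zip
    archive; setuptools stores install_requires in EGG-INFO/requires.txt.
    """
    with zipfile.ZipFile(egg_file) as egg:
        try:
            raw = egg.read("EGG-INFO/requires.txt")
        except KeyError:
            # No requires.txt means the egg declares no requirements.
            return []
    # Keep non-empty lines; section headers for extras (e.g. "[dev]")
    # are returned unchanged, this sketch does not interpret them.
    return [line.strip() for line in raw.decode("utf-8").splitlines() if line.strip()]
```

Calling it on the egg produced by scrapyd-deploy shows only names such as sqlalchemy; the packages themselves are not in the archive, which is why the import fails on the server.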

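So your egg needs to include all of its dependencies, which means you would not build the egg with scrapyd-deploy. I could not find much documentation on whether that is possible.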
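If a self-contained egg is not practical, the usual workaround is to install the dependencies yourself (for example with pip) into the Python environment that runs Scrapyd before deploying. The following is a minimal sketch for checking which requirements are still missing there. It assumes Python 3.8+ for importlib.metadata; the helper name missing_requirements is hypothetical, and the list of names has to be copied from your own setup.py.

```python
from importlib import metadata


def missing_requirements(requirements):
    """Return the distribution names that are not installed
    in the interpreter running this code."""
    missing = []
    for name in requirements:
        try:
            # Looks up installed distribution metadata by name,
            # e.g. "scrapy-splash", not by import name.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Names taken from the install_requires list in the question.
    print(missing_requirements(["scrapy-splash", "sqlalchemy"]))
```

Run it with the same interpreter that starts Scrapyd; checking a different virtualenv would give a misleading answer.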

So what you are seeing is caused by functionality that scrapyd does not implement. You should open a bug or enhancement request at http://github.com/scrapy/scrapyd/