Dockerfile scrapy Unknown command: crawl

时间:2018-08-03 02:53:04

标签: python docker scrapy dockerfile scrapy-spider

Hello, I am trying to run my Scrapy spider from the CMD of a Dockerfile. I have set the WORKDIR to the folder that contains scrapy.cfg and use CMD scrapy crawl to start the spider.

When I run docker-compose up, it returns this error:

Scrapy 1.5.0 - no active project
web_1        |
web_1        | Unknown command: crawl
web_1        |
web_1        | Use "scrapy" to see available commands

Here is my Dockerfile:

FROM ubuntu:18.04
FROM python:3.6-onbuild
RUN  apt-get update &&apt-get upgrade -y&& apt-get install python-pip -y
RUN pip install --upgrade pip
RUN pip install scrapy
ADD . /scrapy_estate/tutorial
WORKDIR /scrapy_estate/tutorial
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 80
CMD scrapy crawl estate

My directory tree:

spider
├── CreateDB.sql
├── docker-compose.yml
├── Dockerfile
├── initdb
│   └── init.sql
├── init.sql
├── npm-debug.log
├── requirements.txt
└── scrapy_estate
    └── tutorial
        ├── scrapy.cfg
        └── tutorial
            ├── __init__.py
            ├── items.py
            ├── middlewares.py
            ├── pipelines.py
            ├── __pycache__
            │   ├── __init__.cpython-36.pyc
            │   ├── items.cpython-36.pyc
            │   ├── middlewares.cpython-36.pyc
            │   ├── pipelines.cpython-36.pyc
            │   └── settings.cpython-36.pyc
            ├── settings.py
            └── spiders
                ├── __init__.py
                ├── __pycache__
                │   ├── __init__.cpython-36.pyc
                │   └── real_estate_spider.cpython-36.pyc
                └── real_estate_spider.py

Did I put the WORKDIR in the wrong place, or is my CMD wrong? Any help would be appreciated, thank you.
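
One way to check where scrapy.cfg actually ends up relative to the WORKDIR is to build the image and list that directory (the image tag scrapy_estate is just an example name):

docker build -t scrapy_estate .
docker run --rm scrapy_estate ls -R /scrapy_estate/tutorial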

Edit:

My ls output:

2to3             __pycache__         docker-compose.yml.save  init.sql        pip           pydoc3.6        python3.6          requirements.txt  tkconch
2to3-3.6         automat-visualize   easy_install             initdb          pip3          pyhtmlizer      python3.6-config   scrapy            trial
CreateDB.sql     cftp                easy_install-3.6         items.py        pip3.6        python          python3.6m         scrapy_estate     twist
Dockerfile       ckeygen             idle                     mailmail        pipelines.py  python-config   python3.6m-config  settings.py       twistd
Dockerfile.save  conch               idle3                    middlewares.py  pydoc         python3         pyvenv             spiders           wheel
__init__.py      docker-compose.yml  idle3.6                  npm-debug.log   pydoc3        python3-config  pyvenv-3.6         splash

1 Answer:

Answer 0 (score: 0):

Use the exec (JSON array) form of CMD:

CMD ["scrapy", "crawl", "estate"]

If you use the shell form of CMD, the command is executed with /bin/sh -c:
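
For example, the same command in both forms:

# shell form - the container runs: /bin/sh -c "scrapy crawl estate"
CMD scrapy crawl estate

# exec form - scrapy is executed directly, without a shell
CMD ["scrapy", "crawl", "estate"]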

If you want to run the command without a shell, you must express it as a JSON array and give the full path of the executable. This array form is the preferred format of CMD. Any additional parameters must be expressed as individual strings in the array:

FROM python:3.6-onbuild
RUN apt-get update && apt-get upgrade -y && apt-get install python-pip -y
RUN pip install --upgrade pip
RUN pip install scrapy
ADD . /scrapy_estate/tutorial
WORKDIR /scrapy_estate/tutorial
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 80
CMD ["scrapy", "crawl", "estate"]
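
After updating the Dockerfile, rebuild the image so the new CMD is picked up:

docker-compose up --build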