I followed the official guide, but got the following error message:
The following packages have unmet dependencies:
 scrapy : Depends: python-support (>= 0.90.0) but it is not installable
          Recommends: python-setuptools but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
Then I tried sudo apt-get install python-support, but found that Ubuntu 16.04 has dropped python-support.
Finally, I tried installing python-setuptools, but it seems that would only install Python 2:
The following additional packages will be installed:
libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python
python-minimal python-pkg-resources python2.7 python2.7-minimal
Suggested packages:
python-doc python-tk python-setuptools-doc python2.7-doc binutils
binfmt-support
The following NEW packages will be installed:
libpython-stdlib libpython2.7-minimal libpython2.7-stdlib python
python-minimal python-pkg-resources python-setuptools python2.7
python2.7-minimal
How can I use Scrapy in a Python 3 environment on Ubuntu 16.04? Thanks.
Answer 0 (score: 2):
You should be good to go with:
apt-get install -y \
python3 \
python-dev \
python3-dev
# for cryptography
apt-get install -y \
build-essential \
libssl-dev \
libffi-dev
# for lxml
apt-get install -y \
libxml2-dev \
libxslt-dev
# install pip
apt-get install -y python-pip
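If you prefer, the same set of packages can be installed with one combined command (add sudo if you're not running as root):
# same packages as above, just condensed into a single apt-get call
apt-get install -y \
    python3 python-dev python3-dev \
    build-essential libssl-dev libffi-dev \
    libxml2-dev libxslt-dev \
    python-pip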
Here is an example Dockerfile to test installing Scrapy with Python 3 on Ubuntu 16.04 / Xenial:
$ cat Dockerfile
FROM ubuntu:xenial
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update
# Install Python3 and dev headers
RUN apt-get install -y \
python3 \
python-dev \
python3-dev
# Install cryptography
RUN apt-get install -y \
build-essential \
libssl-dev \
libffi-dev
# install lxml
RUN apt-get install -y \
libxml2-dev \
libxslt-dev
# install pip
RUN apt-get install -y python-pip
RUN useradd --create-home --shell /bin/bash scrapyuser
USER scrapyuser
WORKDIR /home/scrapyuser
Then, after building the Docker image and running a container with it:
$ sudo docker build -t redapple/scrapy-ubuntu-xenial .
$ sudo docker run -t -i redapple/scrapy-ubuntu-xenial
you can run pip install scrapy. Below, I'm using virtualenvwrapper to create a Python 3 virtualenv:
scrapyuser@88cc645ac499:~$ pip install --user virtualenvwrapper
Collecting virtualenvwrapper
Downloading virtualenvwrapper-4.7.1-py2.py3-none-any.whl
Collecting virtualenv-clone (from virtualenvwrapper)
Downloading virtualenv-clone-0.2.6.tar.gz
Collecting stevedore (from virtualenvwrapper)
Downloading stevedore-1.14.0-py2.py3-none-any.whl
Collecting virtualenv (from virtualenvwrapper)
Downloading virtualenv-15.0.2-py2.py3-none-any.whl (1.8MB)
100% |################################| 1.8MB 320kB/s
Collecting pbr>=1.6 (from stevedore->virtualenvwrapper)
Downloading pbr-1.10.0-py2.py3-none-any.whl (96kB)
100% |################################| 102kB 1.5MB/s
Collecting six>=1.9.0 (from stevedore->virtualenvwrapper)
Downloading six-1.10.0-py2.py3-none-any.whl
Building wheels for collected packages: virtualenv-clone
Running setup.py bdist_wheel for virtualenv-clone ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/24/51/ef/93120d304d240b4b6c2066454250a1626e04f73d34417b956d
Successfully built virtualenv-clone
Installing collected packages: virtualenv-clone, pbr, six, stevedore, virtualenv, virtualenvwrapper
Successfully installed pbr six stevedore virtualenv virtualenv-clone virtualenvwrapper
You are using pip version 8.1.1, however version 8.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
scrapyuser@88cc645ac499:~$ source ~/.local/bin/virtualenvwrapper.sh
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/premkproject
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postmkproject
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/initialize
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/premkvirtualenv
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postmkvirtualenv
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/prermvirtualenv
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postrmvirtualenv
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/predeactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postdeactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/preactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/postactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/get_env_details
scrapyuser@88cc645ac499:~$ export PATH=$PATH:/home/scrapyuser/.local/bin
scrapyuser@88cc645ac499:~$ mkvirtualenv --python=/usr/bin/python3 scrapy11.py3
Running virtualenv with interpreter /usr/bin/python3
Using base prefix '/usr'
New python executable in /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/python3
Also creating executable in /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/python
Installing setuptools, pip, wheel...done.
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/predeactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/postdeactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/preactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/postactivate
virtualenvwrapper.user_scripts creating /home/scrapyuser/.virtualenvs/scrapy11.py3/bin/get_env_details
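As an aside, virtualenvwrapper is not required: the standard library's venv module gives a rough equivalent. A minimal sketch, assuming the python3-venv package (shipped separately on Xenial) and an arbitrary environment path:
# stdlib venv instead of virtualenvwrapper; the path here is just an example
apt-get install -y python3-venv        # as root, or prefix with sudo
python3 -m venv ~/.virtualenvs/scrapy11.py3
source ~/.virtualenvs/scrapy11.py3/bin/activate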
Installing Scrapy 1.1 is then a matter of pip install scrapy:
(scrapy11.py3) scrapyuser@88cc645ac499:~$ pip install scrapy
Collecting scrapy
Downloading Scrapy-1.1.0-py2.py3-none-any.whl (294kB)
100% |################################| 296kB 1.0MB/s
Collecting PyDispatcher>=2.0.5 (from scrapy)
Downloading PyDispatcher-2.0.5.tar.gz
Collecting pyOpenSSL (from scrapy)
Downloading pyOpenSSL-16.0.0-py2.py3-none-any.whl (45kB)
100% |################################| 51kB 1.8MB/s
Collecting lxml (from scrapy)
Downloading lxml-3.6.0.tar.gz (3.7MB)
100% |################################| 3.7MB 312kB/s
Collecting parsel>=0.9.3 (from scrapy)
Downloading parsel-1.0.2-py2.py3-none-any.whl
Collecting six>=1.5.2 (from scrapy)
Using cached six-1.10.0-py2.py3-none-any.whl
Collecting Twisted>=10.0.0 (from scrapy)
Downloading Twisted-16.2.0.tar.bz2 (2.9MB)
100% |################################| 2.9MB 307kB/s
Collecting queuelib (from scrapy)
Downloading queuelib-1.4.2-py2.py3-none-any.whl
Collecting cssselect>=0.9 (from scrapy)
Downloading cssselect-0.9.1.tar.gz
Collecting w3lib>=1.14.2 (from scrapy)
Downloading w3lib-1.14.2-py2.py3-none-any.whl
Collecting service-identity (from scrapy)
Downloading service_identity-16.0.0-py2.py3-none-any.whl
Collecting cryptography>=1.3 (from pyOpenSSL->scrapy)
Downloading cryptography-1.4.tar.gz (399kB)
100% |################################| 409kB 1.1MB/s
Collecting zope.interface>=4.0.2 (from Twisted>=10.0.0->scrapy)
Downloading zope.interface-4.1.3.tar.gz (141kB)
100% |################################| 143kB 1.3MB/s
Collecting attrs (from service-identity->scrapy)
Downloading attrs-16.0.0-py2.py3-none-any.whl
Collecting pyasn1 (from service-identity->scrapy)
Downloading pyasn1-0.1.9-py2.py3-none-any.whl
Collecting pyasn1-modules (from service-identity->scrapy)
Downloading pyasn1_modules-0.0.8-py2.py3-none-any.whl
Collecting idna>=2.0 (from cryptography>=1.3->pyOpenSSL->scrapy)
Downloading idna-2.1-py2.py3-none-any.whl (54kB)
100% |################################| 61kB 2.0MB/s
Requirement already satisfied (use --upgrade to upgrade): setuptools>=11.3 in ./.virtualenvs/scrapy11.py3/lib/python3.5/site-packages (from cryptography>=1.3->pyOpenSSL->scrapy)
Collecting cffi>=1.4.1 (from cryptography>=1.3->pyOpenSSL->scrapy)
Downloading cffi-1.6.0.tar.gz (397kB)
100% |################################| 399kB 1.1MB/s
Collecting pycparser (from cffi>=1.4.1->cryptography>=1.3->pyOpenSSL->scrapy)
Downloading pycparser-2.14.tar.gz (223kB)
100% |################################| 225kB 1.2MB/s
Building wheels for collected packages: PyDispatcher, lxml, Twisted, cssselect, cryptography, zope.interface, cffi, pycparser
Running setup.py bdist_wheel for PyDispatcher ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/86/02/a1/5857c77600a28813aaf0f66d4e4568f50c9f133277a4122411
Running setup.py bdist_wheel for lxml ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/6c/eb/a1/e4ff54c99630e3cc6ec659287c4fd88345cd78199923544412
Running setup.py bdist_wheel for Twisted ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/fe/9d/3f/9f7b1c768889796c01929abb7cdfa2a9cdd32bae64eb7aa239
Running setup.py bdist_wheel for cssselect ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/1b/41/70/480fa9516ccc4853a474faf7a9fb3638338fc99a9255456dd0
Running setup.py bdist_wheel for cryptography ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/f6/6c/21/11ec069285a52d7fa8c735be5fc2edfb8b24012c0f78f93d20
Running setup.py bdist_wheel for zope.interface ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/52/04/ad/12c971c57ca6ee5e6d77019c7a1b93105b1460d8c2db6e4ef1
Running setup.py bdist_wheel for cffi ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/8f/00/29/553c1b1db38bbeec3fec428ae4e400cd8349ecd99fe86edea1
Running setup.py bdist_wheel for pycparser ... done
Stored in directory: /home/scrapyuser/.cache/pip/wheels/9b/f4/2e/d03e949a551719a1ffcb659f2c63d8444f4df12e994ce52112
Successfully built PyDispatcher lxml Twisted cssselect cryptography zope.interface cffi pycparser
Installing collected packages: PyDispatcher, idna, pyasn1, six, pycparser, cffi, cryptography, pyOpenSSL, lxml, w3lib, cssselect, parsel, zope.interface, Twisted, queuelib, attrs, pyasn1-modules, service-identity, scrapy
Successfully installed PyDispatcher-2.0.5 Twisted-16.2.0 attrs-16.0.0 cffi-1.6.0 cryptography-1.4 cssselect-0.9.1 idna-2.1 lxml-3.6.0 parsel-1.0.2 pyOpenSSL-16.0.0 pyasn1-0.1.9 pyasn1-modules-0.0.8 pycparser-2.14 queuelib-1.4.2 scrapy-1.1.0 service-identity-16.0.0 six-1.10.0 w3lib-1.14.2 zope.interface-4.1.3
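At this point you can sanity-check that Scrapy really is running on Python 3; scrapy version -v prints the versions of Scrapy, its main dependencies, and the Python interpreter in use (output omitted here):
(scrapy11.py3) scrapyuser@88cc645ac499:~$ scrapy version -v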
And finally, testing the example project:
(scrapy11.py3) scrapyuser@88cc645ac499:~$ scrapy startproject tutorial
New Scrapy project 'tutorial', using template directory '/home/scrapyuser/.virtualenvs/scrapy11.py3/lib/python3.5/site-packages/scrapy/templates/project', created in:
/home/scrapyuser/tutorial
You can start your first spider with:
cd tutorial
scrapy genspider example example.com
(scrapy11.py3) scrapyuser@88cc645ac499:~$ cd tutorial
(scrapy11.py3) scrapyuser@88cc645ac499:~/tutorial$ scrapy genspider example example.com
Created spider 'example' using template 'basic' in module:
tutorial.spiders.example
(scrapy11.py3) scrapyuser@88cc645ac499:~/tutorial$ cat tutorial/spiders/example.py
# -*- coding: utf-8 -*-
import scrapy
class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = (
        'http://www.example.com/',
    )

    def parse(self, response):
        pass
(scrapy11.py3) scrapyuser@88cc645ac499:~/tutorial$ scrapy crawl example
2016-06-07 11:08:27 [scrapy] INFO: Scrapy 1.1.0 started (bot: tutorial)
2016-06-07 11:08:27 [scrapy] INFO: Overridden settings: {'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial', 'ROBOTSTXT_OBEY': True, 'NEWSPIDER_MODULE': 'tutorial.spiders'}
2016-06-07 11:08:27 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats', 'scrapy.extensions.corestats.CoreStats']
2016-06-07 11:08:27 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-06-07 11:08:27 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-06-07 11:08:27 [scrapy] INFO: Enabled item pipelines:
[]
2016-06-07 11:08:27 [scrapy] INFO: Spider opened
2016-06-07 11:08:28 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-06-07 11:08:28 [scrapy] DEBUG: Crawled (404) <GET http://www.example.com/robots.txt> (referer: None)
2016-06-07 11:08:28 [scrapy] DEBUG: Crawled (200) <GET http://www.example.com/> (referer: None)
2016-06-07 11:08:28 [scrapy] INFO: Closing spider (finished)
2016-06-07 11:08:28 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 436,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
'downloader/response_bytes': 1921,
'downloader/response_count': 2,
'downloader/response_status_count/200': 1,
'downloader/response_status_count/404': 1,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2016, 6, 7, 11, 8, 28, 614605),
'log_count/DEBUG': 2,
'log_count/INFO': 7,
'response_received_count': 2,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2016, 6, 7, 11, 8, 28, 24624)}
2016-06-07 11:08:28 [scrapy] INFO: Spider closed (finished)
(scrapy11.py3) scrapyuser@88cc645ac499:~/tutorial$
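As a quick follow-up beyond the original answer, scrapy shell is handy for trying out selectors against the same page before filling in parse(); the CSS selector below is only an illustration:
(scrapy11.py3) scrapyuser@88cc645ac499:~/tutorial$ scrapy shell 'http://www.example.com/'
# then, at the interactive prompt, e.g.:
#   response.css('title::text').extract_first()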