Installing Scrapy on Debian

Date: 2014-12-07 21:23:23

Tags: python linux debian scrapy

I ran into a problem: Scrapy worked fine until I uninstalled and reinstalled it. Since 0.24 was not available on Debian at the time, I added the Ubuntu repo to my /etc/apt/sources.list.d and installed it with apt-get, as described here: http://doc.scrapy.org/en/0.24/topics/ubuntu.html

Seeing today that it is now available on Debian, I removed scrapy-0.24 (the one installed from the Ubuntu repo) with apt-get and then ran apt-get install python-scrapy.

Now, when I run scrapy shell www.google.fr, the output is:

2014-12-07 22:08:26+0100 [scrapy] INFO: Scrapy 0.24.2 started (bot: scrapybot)
2014-12-07 22:08:26+0100 [scrapy] INFO: Optional features available: ssl, http11, boto, django
2014-12-07 22:08:26+0100 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled item pipelines: 
2014-12-07 22:08:26+0100 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2014-12-07 22:08:26+0100 [scrapy] DEBUG: Web service listening on 127.0.0.1:6080
2014-12-07 22:08:26+0100 [default] INFO: Spider opened
2014-12-07 22:08:26+0100 [default] DEBUG: Retrying <GET file:///home/lotso/www.google.fr> (failed 1 times): [Errno 2] No such file or directory: '/home/lotso/www.google.fr'
2014-12-07 22:08:26+0100 [default] DEBUG: Retrying <GET file:///home/lotso/www.google.fr> (failed 2 times): [Errno 2] No such file or directory: '/home/lotso/www.google.fr'
2014-12-07 22:08:26+0100 [default] DEBUG: Gave up retrying <GET file:///home/lotso/www.google.fr> (failed 3 times): [Errno 2] No such file or directory: '/home/lotso/www.google.fr'
Traceback (most recent call last):
  File "/usr/bin/scrapy", line 4, in <module>
execute()
  File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
cmd.run(args, opts)
  File "/usr/lib/python2.7/dist-packages/scrapy/commands/shell.py", line 50, in run
shell.start(url=url, spider=spider)
  File "/usr/lib/python2.7/dist-packages/scrapy/shell.py", line 45, in start
self.fetch(url, spider)
  File "/usr/lib/python2.7/dist-packages/scrapy/shell.py", line 90, in fetch
reactor, self._schedule, request, spider)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
result.raiseException()
  File "<string>", line 2, in raiseException
IOError: [Errno 2] No such file or directory: '/home/lotso/www.google.fr'

As you can imagine, ➜ ~ pwd gives /home/lotso. If I change directory, it prepends whatever directory I am in. I tried uninstalling python-scrapy with purge and then installing it through pip, and I get the same problem.

I am at a loss now. I suspect an environment variable, but I have not managed to solve it myself...
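For context: the file:///home/lotso/www.google.fr retries in the log above show that this Scrapy 0.24 shell is treating an argument without a URL scheme as a local file path relative to the current directory. Passing a full URL such as http://www.google.fr avoids that. A minimal sketch of that normalization (the function name is illustrative, not part of Scrapy):

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2.7, as in the traceback above

def with_scheme(url):
    """Prepend http:// when the argument has no scheme, so it is
    fetched over the network instead of being read as a local file."""
    if not urlparse(url).scheme:
        return 'http://' + url
    return url

print(with_scheme('www.google.fr'))  # http://www.google.fr
```

With this in mind, running scrapy shell http://www.google.fr (scheme included) should sidestep the IOError shown in the traceback.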

2 Answers:

Answer 0 (score: 2)

This method works with Scrapy 1.0.3 on Debian 8.2.

  1. Install the dependencies needed for setup:
    sudo apt-get install python-twisted python-libxml2 python-libxml2-dbg python-openssl python-simplejson
  2. Download Scrapy from this site: http://scrapy.org/ (for example, choose the tarball)
  3. Extract and set it up
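Step 3 is not spelled out; a hedged sketch of the usual tarball routine (the filename is an assumption matching the 1.0.3 version mentioned above, so adjust it to the file you actually downloaded):

```shell
# Extract the downloaded tarball and install it system-wide
tar xzf Scrapy-1.0.3.tar.gz
cd Scrapy-1.0.3
sudo python setup.py install
```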

Answer 1 (score: 2)

How to install Scrapy 1.3 with Python 2.7 on Debian 8 (x86)

Restart your machine, then run the following as root (or with sudo).

apt-get update 
apt-get upgrade
apt-get install virtualenv

Basic information about virtual environments: https://virtualenv.pypa.io/en/stable/userguide/

virtualenv ENV
cd ENV
source bin/activate

The virtual environment is now activated (the command "deactivate" simply deactivates it).

apt-get install gcc
apt-get install python-pip
apt-get install cython
apt-get install python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
pip install pip --upgrade
pip install scrapy
pip install scrapy --upgrade

This worked for me; I applied it on a clean installation.
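As a quick sanity check after the steps above (run inside the activated ENV), something along these lines confirms which Scrapy the virtualenv actually picked up:

```shell
# Verify the install from inside the activated virtualenv
python -c "import scrapy; print(scrapy.__version__)"
scrapy version
```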