I ran into a problem: Scrapy worked fine until I uninstalled and reinstalled it. Since 0.24 was not available on Debian at the time, I added the Ubuntu repo to my /etc/apt/sources.list.d and installed it with apt-get, as described here: http://doc.scrapy.org/en/0.24/topics/ubuntu.html
Today, seeing that it is now available on Debian, I did apt-get remove on scrapy-0.24 (the one installed from the Ubuntu repo) and then apt-get install python-scrapy.
Now, when I run scrapy shell www.google.fr
the output is:
2014-12-07 22:08:26+0100 [scrapy] INFO: Scrapy 0.24.2 started (bot: scrapybot)
2014-12-07 22:08:26+0100 [scrapy] INFO: Optional features available: ssl, http11, boto, django
2014-12-07 22:08:26+0100 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-12-07 22:08:26+0100 [scrapy] INFO: Enabled item pipelines:
2014-12-07 22:08:26+0100 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2014-12-07 22:08:26+0100 [scrapy] DEBUG: Web service listening on 127.0.0.1:6080
2014-12-07 22:08:26+0100 [default] INFO: Spider opened
2014-12-07 22:08:26+0100 [default] DEBUG: Retrying <GET file:///home/lotso/www.google.fr> (failed 1 times): [Errno 2] No such file or directory: '/home/lotso/www.google.fr'
2014-12-07 22:08:26+0100 [default] DEBUG: Retrying <GET file:///home/lotso/www.google.fr> (failed 2 times): [Errno 2] No such file or directory: '/home/lotso/www.google.fr'
2014-12-07 22:08:26+0100 [default] DEBUG: Gave up retrying <GET file:///home/lotso/www.google.fr> (failed 3 times): [Errno 2] No such file or directory: '/home/lotso/www.google.fr'
Traceback (most recent call last):
  File "/usr/bin/scrapy", line 4, in <module>
    execute()
  File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/lib/python2.7/dist-packages/scrapy/commands/shell.py", line 50, in run
    shell.start(url=url, spider=spider)
  File "/usr/lib/python2.7/dist-packages/scrapy/shell.py", line 45, in start
    self.fetch(url, spider)
  File "/usr/lib/python2.7/dist-packages/scrapy/shell.py", line 90, in fetch
    reactor, self._schedule, request, spider)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "<string>", line 2, in raiseException
IOError: [Errno 2] No such file or directory: '/home/lotso/www.google.fr'
As you can imagine:
➜ ~ pwd
/home/lotso
If I change directory, it appends whichever directory I am currently in instead.
I tried uninstalling python-scrapy with purge and then installing it via pip, and I get the same problem.
I am at a loss now; I suspect an environment variable, but I haven't managed to solve it on my own...
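Judging from the traceback, the shell falls back to a file:// URI for a scheme-less argument and resolves it against the current directory, which matches the path-appending behaviour described above; passing the full URL (scrapy shell http://www.google.fr) should sidestep it. As a minimal sketch of that normalization (a hypothetical helper, not Scrapy's own code):

```python
from urllib.parse import urlparse

def normalize_url(raw):
    # If the argument carries no scheme (e.g. "www.google.fr"),
    # assume HTTP rather than letting it fall back to a local file path.
    if urlparse(raw).scheme == "":
        return "http://" + raw
    return raw

print(normalize_url("www.google.fr"))         # http://www.google.fr
print(normalize_url("http://www.google.fr"))  # unchanged
```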
Answer 0 (score: 2)
This method works with scrapy 1.0.3 and debian 8.2:
sudo apt-get install python-twisted python-libxml2 python-libxml2-dbg python-openssl python-simplejson
Answer 1 (score: 2)
Restart your machine, then run the following as root (or with sudo).
apt-get update
apt-get upgrade
apt-get install virtualenv
Basic information about virtual environments: https://virtualenv.pypa.io/en/stable/userguide/
virtualenv ENV
cd ENV
source bin/activate
The virtual environment is now activated (the command "deactivate" simply deactivates it).
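The create/activate/deactivate cycle above can be sketched like this (using the stdlib venv module here for illustration; the answer uses the separate virtualenv package, but the workflow is the same):

```shell
python3 -m venv ENV        # create the environment in ./ENV
. ENV/bin/activate         # "python" and "pip" now resolve inside ENV
python -c "import sys; print(sys.prefix)"   # points at ENV while active
deactivate                 # restore the original shell environment
```

While the environment is active, pip install scrapy installs into ENV only, leaving the system Python packages untouched.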
apt-get install gcc
apt-get install python-pip
apt-get install cython
apt-get install python-dev python-pip libxml2-dev libxslt1-dev zlib1g-dev libffi-dev libssl-dev
pip install pip --upgrade
pip install scrapy
pip install scrapy --upgrade
This worked for me; I applied it on a fresh install.