I seem to have run into a problem deploying a Scrapy spider that causes a "cannot listen" error. I haven't been able to apply any of the previous answers successfully, either because my problem is different or because the fixes weren't detailed enough for me to follow.
I uploaded a project, and the deploy command was working fine yesterday. Now that I'm playing with it again, when I run scrapy deploy -l to see the list of deploy targets, I get:
Scrapy 0.24.4 - no active project
Unknown command: deploy
Use "scrapy" to see available commands
A common fix seems to be to restart Scrapyd with the command: scrapyd. When I do that, I get:
2014-09-17 01:58:47+0000 [-] Log opened.
2014-09-17 01:58:47+0000 [-] twistd 13.2.0 (/usr/bin/python 2.7.6) starting up.
2014-09-17 01:58:47+0000 [-] reactor class: twisted.internet.epollreactor.EPollReactor.
2014-09-17 01:58:47+0000 [-] Traceback (most recent call last):
2014-09-17 01:58:47+0000 [-] File "/usr/bin/scrapyd", line 8, in <module>
2014-09-17 01:58:47+0000 [-] run()
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/scripts/twistd.py", line 27, in run
2014-09-17 01:58:47+0000 [-] app.run(runApp, ServerOptions)
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 642, in run
2014-09-17 01:58:47+0000 [-] runApp(config)
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/scripts/twistd.py", line 23, in runApp
2014-09-17 01:58:47+0000 [-] _SomeApplicationRunner(config).run()
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 380, in run
2014-09-17 01:58:47+0000 [-] self.postApplication()
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/scripts/_twistd_unix.py", line 193, in postApplication
2014-09-17 01:58:47+0000 [-] self.startApplication(self.application)
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/scripts/_twistd_unix.py", line 381, in startApplication
2014-09-17 01:58:47+0000 [-] service.IService(application).privilegedStartService()
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/application/service.py", line 277, in privilegedStartService
2014-09-17 01:58:47+0000 [-] service.privilegedStartService()
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/application/internet.py", line 105, in privilegedStartService
2014-09-17 01:58:47+0000 [-] self._port = self._getPort()
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/application/internet.py", line 133, in _getPort
2014-09-17 01:58:47+0000 [-] 'listen%s' % (self.method,))(*self.args, **self.kwargs)
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 495, in listenTCP
2014-09-17 01:58:47+0000 [-] p.startListening()
2014-09-17 01:58:47+0000 [-] File "/usr/lib/python2.7/dist-packages/twisted/internet/tcp.py", line 980, in startListening
2014-09-17 01:58:47+0000 [-] raise CannotListenError(self.interface, self.port, le)
2014-09-17 01:58:47+0000 [-] twisted.internet.error.CannotListenError: Couldn't listen on 0.0.0.0:6800: [Errno 98] Address already in use.
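The Errno 98 on the last line means another process (most likely an earlier scrapyd instance) is already bound to 0.0.0.0:6800. As a quick sanity check, the same condition can be detected from Python with nothing Scrapy-specific; this is just a sketch, with the port number taken from the traceback:

```python
import socket

def port_in_use(port, host="0.0.0.0"):
    """Return True if something is already listening on host:port.

    Tries to bind a fresh socket; a bind failure (EADDRINUSE, Errno 98
    on Linux) is exactly what makes twisted raise CannotListenError.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        s.bind((host, port))
    except OSError:
        return True
    finally:
        s.close()
    return False

# e.g. port_in_use(6800) -> True while a scrapyd is still running
```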
Based on that message, and on some other questions posted here, it looks like some kind of listening error, but I can't figure out which solution applies or where to make the change.
Edit:
Here is what I get after restarting Scrapyd:
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:6800 0.0.0.0:* LISTEN 956/python
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1004/sshd
tcp6 0 0 :::22 :::* LISTEN 1004/sshd
udp 0 0 0.0.0.0:14330 0.0.0.0:* 509/dhclient
udp 0 0 0.0.0.0:68 0.0.0.0:* 509/dhclient
udp6 0 0 :::3311 :::* 509/dhclient
Edit 2:
So I went back to my local project directory and started again, trying to figure out where this all went wrong. Here is what I now get when I try to list the targets locally:
Christophers-MacBook-Pro:shn Chris$ scrapy deploy -l
aws-target http://*********.compute-1.amazonaws.com:6800/
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 5, in <module>
pkg_resources.run_script('Scrapy==0.22.2', 'scrapy')
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 489, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 1207, in run_script
execfile(script_filename, namespace, namespace)
File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/EGG-INFO/scripts/scrapy", line 4, in <module>
execute()
File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/cmdline.py", line 143, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/cmdline.py", line 89, in _run_print_help
func(*a, **kw)
File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/cmdline.py", line 150, in _run_command
cmd.run(args, opts)
File "/Library/Python/2.7/site-packages/Scrapy-0.22.2-py2.7.egg/scrapy/commands/deploy.py", line 76, in run
print("%-20s %s" % (name, target['url']))
KeyError: 'url'
Edit 3:
Here is the config file...
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# http://doc.scrapy.org/en/latest/topics/scrapyd.html
[settings]
default = shn.settings
[deploy:local-target]
#url = http://localhost:6800/
project = shn
[deploy:aws-target]
url = http://********.compute-1.amazonaws.com:6800/
project = shn
For what it's worth, I can now run it again using the curl option, and it saves the log file and output on aws:6800. The scrapy deploy command, however, still gives me the error posted earlier.
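Edit 2's KeyError is consistent with the config in Edit 3: the deploy -l code in the traceback prints target['url'] for every [deploy:...] section, but local-target's url line is commented out, so that key never exists. A minimal sketch reproducing the situation with the stdlib configparser (the parsing is simplified compared to Scrapy's actual code, and the hostname is a placeholder):

```python
from configparser import ConfigParser

CFG = """
[settings]
default = shn.settings

[deploy:local-target]
#url = http://localhost:6800/
project = shn

[deploy:aws-target]
url = http://example.compute-1.amazonaws.com:6800/
project = shn
"""

def deploy_targets(cfg_text):
    """Collect [deploy:<name>] sections into {name: options} dicts."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    return {section.split(":", 1)[1]: dict(parser.items(section))
            for section in parser.sections()
            if section.startswith("deploy:")}

targets = deploy_targets(CFG)
# aws-target has a url; local-target does not, because the line starting
# with '#' is parsed as a comment -- so the equivalent of
#   print("%-20s %s" % (name, target['url']))
# prints the aws-target line and then raises KeyError: 'url',
# matching the traceback in Edit 2.
```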
Answer 0 (score: 1)
It sounds like scrapyd is still running, because Twisted never released the port. Can you confirm using netstat:
$ sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:17123 0.0.0.0:* LISTEN 1048/python
tcp 0 0 0.0.0.0:6800 0.0.0.0:* LISTEN 1434/python
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 995/sshd
tcp6 0 0 :::22 :::* LISTEN 995/sshd
udp 0 0 127.0.0.1:8125 0.0.0.0:* 1047/python
udp 0 0 0.0.0.0:68 0.0.0.0:* 493/dhclient
udp 0 0 0.0.0.0:16150 0.0.0.0:* 493/dhclient
udp6 0 0 :::28687 :::* 493/dhclient
Kill scrapyd:
$ sudo kill -INT $(cat /var/run/scrapyd.pid)
Then restart it:
$ sudo service scrapyd start
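The kill step above can also be done from Python if you are scripting this; the following is just a sketch mirroring kill -INT $(cat /var/run/scrapyd.pid), and the pidfile path is the conventional one, which may differ on your system:

```python
import os
import signal

def stop_scrapyd(pidfile="/var/run/scrapyd.pid"):
    """Read scrapyd's pidfile and send SIGINT to that process,
    equivalent to: kill -INT $(cat /var/run/scrapyd.pid)"""
    with open(pidfile) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGINT)
    return pid
```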
Then cd into your project directory and make sure the deploy targets are defined in your scrapy.cfg file:
$ cd ~/takeovertheworld
vagrant@portia:~/takeovertheworld$ cat scrapy.cfg
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# http://doc.scrapy.org/en/latest/topics/scrapyd.html
[settings]
default = takeovertheworld.settings
[deploy:local-target]
url = http://localhost:6800/
project = takeovertheworld
[deploy:aws-target]
url = http://my-ec2-instance.amazonaws.com:6800/
project = takeovertheworld
And deploy the project:
vagrant@portia:~/takeovertheworld$ scrapy deploy aws-target
Packing version 1410145736
Deploying to project "takeovertheworld" in http://ec2-xx-xxx-xx-xxx.compute-1.amazonaws.com:6800/addversion.json
Server response (200):
{"status": "ok", "project": "takeovertheworld", "version": "1410145736", "spiders": 1}
Edit your scrapy.cfg file: remove the # from the url line in local-target, or remove local-target completely if you don't need it.
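Applied to the config from Edit 3, the first option would look like this (only the url line under local-target changes; the redacted AWS hostname stays as in the question):

```ini
[settings]
default = shn.settings

[deploy:local-target]
url = http://localhost:6800/
project = shn

[deploy:aws-target]
url = http://********.compute-1.amazonaws.com:6800/
project = shn
```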
Answer 1 (score: 0)
Try stopping and restarting the scrapyd service on your Amazon EC2 server, and make sure your config file has the correct deploy information:
[deploy:deploye_name]
url = http://ip_Address:port_number/
project = your_project_name
Go to the project directory where scrapy.cfg exists and check the available deploy targets:
scrapy deploy -l