I am trying to deploy a Scrapy project with scrapyd. I can run the project normally using:
cd /var/www/api/scrapy/dirbot
scrapy crawl dmoz
Here is what I did, step by step:
1/ I ran
scrapy version -v
>> Scrapy : 0.16.3
lxml : 3.0.2.0
libxml2 : 2.7.8
Twisted : 12.2.0
Python : 2.7.3 (default, Aug 1 2012, 05:14:39) - [GCC 4.6.3]
Platform: Linux-3.2.0-31-virtual-x86_64-with-Ubuntu-12.04-precise
2/ I installed scrapyd using
aptitude install scrapyd-0.16
3/ My scraping project is in /var/www/api/scrapy/dirbot (http://domain.com/api/scrapy/dirbot). I edited scrapy.cfg:
[settings]
default = dirbot.settings
[deploy:scrapyd2]
url = http://domain.com/api/scrapy/dirbot/
username = vu
password = hoang
4/ I tested with the deploy command
scrapy deploy -l
>> scrapyd2 http://domain.com/api/scrapy/dirbot/
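For reference, the listing in step 4 only reflects what is written in scrapy.cfg; it does not contact the server. The sketch below reads the `[deploy:...]` sections with Python 3's standard `configparser` to show where the target name and URL come from. The helper name `list_deploy_targets` is hypothetical and this is not Scrapy's own implementation.

```python
import configparser

# The scrapy.cfg contents from step 3, embedded so the sketch is self-contained.
SCRAPY_CFG = """
[settings]
default = dirbot.settings

[deploy:scrapyd2]
url = http://domain.com/api/scrapy/dirbot/
username = vu
password = hoang
"""


def list_deploy_targets(cfg_text):
    """Return {target_name: url} for every [deploy:<name>] section.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    targets = {}
    for section in parser.sections():
        if section.startswith("deploy:"):
            name = section.split(":", 1)[1]
            targets[name] = parser.get(section, "url")
    return targets


targets = list_deploy_targets(SCRAPY_CFG)
for name, url in targets.items():
    print(name, url)
```

A target showing up in this listing therefore says nothing about whether a scrapyd service is actually reachable at that URL.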
5/ But when I use the command
scrapy deploy -L scrapyd2
>> /usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/settings/deprecated.py:23: ScrapyDeprecationWarning: You are using the following settings which are deprecated or obsolete (ask scrapy-users@googlegroups.com for alternatives):
BOT_VERSION: no longer used (user agent defaults to Scrapy now)
warnings.warn(msg, ScrapyDeprecationWarning)
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 5, in <module>
pkg_resources.run_script('Scrapy==0.16.3', 'scrapy')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 499, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1235, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/EGG-INFO/scripts/scrapy", line 4, in <module>
execute()
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/cmdline.py", line 131, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/cmdline.py", line 76, in _run_print_help
func(*a, **kw)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/cmdline.py", line 138, in _run_command
cmd.run(args, opts)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/commands/deploy.py", line 76, in run
f = urllib2.urlopen(req)
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 406, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 444, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
and
scrapy deploy scrapyd2 -p project
>> /usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/settings/deprecated.py:23: ScrapyDeprecationWarning: You are using the following settings which are deprecated or obsolete (ask scrapy-users@googlegroups.com for alternatives):
BOT_VERSION: no longer used (user agent defaults to Scrapy now)
warnings.warn(msg, ScrapyDeprecationWarning)
Building egg of project-1358597244
'build/scripts-2.7' does not exist -- can't clean it
zip_safe flag not set; analyzing archive contents...
Deploying project-1358597244 to http://domain.com/api/scrapy/dirbot/addversion.json
Deploy failed (404):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /api/scrapy/dirbot/addversion.json was not found on this server.</p>
<hr>
<address>Apache/2.2.22 (Ubuntu) Server at domain.com Port 80</address>
</body></html>
* I don't understand what a Python egg is. Can you give me an example? I don't know whether I have one. Maybe it is the file /var/www/api/scrapy/dirbot/setup.py?
from setuptools import setup, find_packages
setup(
name = 'project',
version = '1.0',
packages = find_packages(),
entry_points = {'scrapy': ['settings = dirbot.settings']},
)
* How do I deploy my project? I don't know what I did wrong, or whether I missed a step.
Thanks
Answer 0 (score: 1)
Judging from the error, the site you are pointing at returns a 404: either you entered the wrong address or something is misconfigured.
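The 404 page in the question is signed "Apache/2.2.22 (Ubuntu) Server at domain.com Port 80", so the request for addversion.json was answered by Apache on port 80. The sketch below shows how such an endpoint URL is composed from the `url` in scrapy.cfg, assuming a plain URL join (which matches the "Deploying ... to" line in the log). The helper name `addversion_endpoint` is hypothetical, and `http://localhost:6800/` is an assumed address that relies on scrapyd's default port of 6800; neither comes from the question.

```python
from urllib.parse import urljoin, urlsplit


def addversion_endpoint(target_url):
    """Append addversion.json to a deploy target URL (hypothetical helper)."""
    return urljoin(target_url, "addversion.json")


# URL taken from the [deploy:scrapyd2] section in the question.
configured = addversion_endpoint("http://domain.com/api/scrapy/dirbot/")

# Assumption: a scrapyd instance on the same machine, default port 6800.
assumed_scrapyd = addversion_endpoint("http://localhost:6800/")

print(configured)
print(assumed_scrapyd)

# With no explicit port, an http URL goes to port 80, where Apache answered.
configured_port = urlsplit(configured).port or 80
assumed_port = urlsplit(assumed_scrapyd).port
print(configured_port, assumed_port)
```

If the configured URL points at the directory served by Apache rather than at the scrapyd service, a 404 for addversion.json is what one would expect to see.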
Regarding Python eggs, there is a good answer at What is a Python egg?
Regarding setup.py: as far as I know, it is used to install a package with the command python setup.py install
Edit: It seems I was confusing this with the pip command, sorry.
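On the egg question itself: setup.py is the build recipe, not the egg. An egg is a zip archive that contains the package code plus an EGG-INFO directory with metadata, and the `entry_points` argument in setup.py ends up in EGG-INFO/entry_points.txt. The Python 3 sketch below builds a hand-made stand-in for such an archive in memory and reads the settings entry point back. It is only an illustration of the layout; a real egg is produced by setuptools, and the file contents here are assumptions modeled on the question's setup.py.

```python
import configparser
import io
import zipfile

# Hand-built stand-in for an egg (assumption: real eggs come from setuptools).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as egg:
    egg.writestr("dirbot/__init__.py", "")
    egg.writestr("dirbot/settings.py", "BOT_NAME = 'dirbot'\n")
    # Mirrors entry_points = {'scrapy': ['settings = dirbot.settings']}.
    egg.writestr(
        "EGG-INFO/entry_points.txt",
        "[scrapy]\nsettings = dirbot.settings\n",
    )

# Open the archive again and read the metadata back.
with zipfile.ZipFile(buf) as egg:
    names = egg.namelist()
    entry_points_text = egg.read("EGG-INFO/entry_points.txt").decode("utf-8")

# entry_points.txt uses INI syntax, so configparser can read it.
parser = configparser.ConfigParser()
parser.read_string(entry_points_text)
settings_module = parser.get("scrapy", "settings")

print(names)
print(settings_module)
```

The "Building egg of project-1358597244" line in the question's output indicates that the deploy command builds the egg itself, so the build step worked and only the upload failed.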