I am running a simple Scrapy spider against a Google link that serves the search results for "hello", but it fails with an error.
The spider code:
import scrapy
import re


class LinsSpider(scrapy.Spider):
    name = "lins"
    allowed_domains = ["www.google.com"]
    start_urls = ('https://www.google.co.in/?gfe_rd=cr&ei=78uyWPjFH8WL8Qe7kKf4BA#q=hello&*',)

    def parse(self, response):
        pagestr = "satanimant@gmail.com"
        yield {
            # the pattern has no capturing group, so take group(0), the whole match
            'asin': str(re.search(r"^[A-Za-z0-9\.\+_-]+@[A-Za-z0-9\._-]+\.[a-zA-Z]*$", pagestr).group(0).strip()),
        }
The error is:
2017-02-26 18:06:11 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-02-26 18:06:11 [scrapy] ERROR: Error downloading <GET http://www.google.com/>
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred
result = f(*args, **kw)
File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/__init__.py", line 41, in download_request
return handler(request, spider)
File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 44, in download_request
return agent.download_request(request)
File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 211, in download_request
d = agent.request(method, url, headers, bodyproducer)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1631, in request
parsedURI.originForm)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1408, in _requestWithEndpoint
d = self._pool.getConnection(key, endpoint)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1294, in getConnection
return self._newConnection(key, endpoint)
File "/usr/local/lib/python2.7/dist-packages/twisted/web/client.py", line 1306, in _newConnection
return endpoint.connect(factory)
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/endpoints.py", line 788, in connect
EndpointReceiver, self._hostText, portNumber=self._port
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/_resolver.py", line 174, in resolveHostName
onAddress = self._simpleResolver.getHostByName(hostName)
File "/usr/lib/python2.7/dist-packages/scrapy/resolver.py", line 21, in getHostByName
d = super(CachingThreadedResolver, self).getHostByName(name, timeout)
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 276, in getHostByName
timeoutDelay = sum(timeout)
TypeError: 'float' object is not iterable
2017-02-26 18:06:11 [scrapy] INFO: Closing spider (finished)
2017-02-26 18:06:11 [scrapy] INFO: Dumping Scrapy stats:
Please help me solve this. I am on Ubuntu 16.10.
Answer 0 (score: 1)
I found the problem. The installed Twisted version is too new; change it to 16.6.0 and it works!
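For reference, a minimal check (my own sketch, not part of the original answer): after downgrading, e.g. with pip install Twisted==16.6.0, you can confirm which Twisted version Scrapy will import, and also see why the traceback ends in TypeError, since the failing line "timeoutDelay = sum(timeout)" receives a plain float from Scrapy's resolver:

# Hedged sketch: confirm the installed Twisted version and illustrate the
# sum(timeout) failure shown in the traceback above.
import twisted

print(twisted.version)   # expect something like [twisted, version 16.6.0] after the downgrade

timeout = 60.0           # a bare float, like the one reaching twisted/internet/base.py
try:
    sum(timeout)         # the traceback's "timeoutDelay = sum(timeout)" does exactly this
except TypeError as err:
    print(err)           # 'float' object is not iterable

print(sum((timeout,)))   # a tuple/sequence is what sum() actually expects here

The downgrade appears to work because the error comes from the newer Twisted name-resolution path (twisted/internet/_resolver.py in the traceback) handing Scrapy's float DNS timeout straight to sum(); pinning Twisted at 16.6.0 avoids that path.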