I am trying to scrape a website.
import requests

url = "http://www.hellotrade.com/business/"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36",
    "Connection": "close",
}
res = requests.get(url, headers=headers, timeout=30)
It runs perfectly at first, but after running for a while it fails with this error message:
Traceback (most recent call last):
  File "C:\Users\millshih\Desktop\hellotrade.py", line 32, in <module>
    res = s.get(url, headers = headers, timeout = 30)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "C:\Python27\lib\site-packages\requests\adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.hellotrade.com', port=80): Max retries exceeded with url: /business/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0294F3D0>: Failed to establish a new connection: [Errno 10060] \xb3s\xbdu\xb9\xc1\xb8\xd5\xa5\xa2\xb1\xd1\xa1A\xa6]\xac\xb0\xb3s\xbdu\xb9\xef\xb6H\xa6\xb3\xa4@\xacq\xae\xc9\xb6\xa1\xa8\xc3\xa5\xbc\xa5\xbf\xbdT\xa6^\xc0\xb3\xa1A\xa9\xce\xacO\xb3s\xbdu\xab\xd8\xa5\xdf\xa5\xa2\xb1\xd1\xa1A\xa6]\xac\xb0\xb3s\xbdu\xaa\xba\xa5D\xbe\xf7\xb5L\xaak\xa6^\xc0\xb3\xa1C',))
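The escaped bytes after [Errno 10060] are probably the localized Windows description of that error (10060 is WSAETIMEDOUT, a connection timeout). A minimal sketch for making them readable, assuming the text is Big5-encoded, which is only a guess based on the byte pattern; `raw` and `decoded` are names chosen for this example, and the snippet uses Python 3 syntax:

```python
# First few bytes of the escaped message, copied from the traceback above.
raw = b"\xb3s\xbdu\xb9\xc1\xb8\xd5\xa5\xa2\xb1\xd1"

# Assumption: the console code page was Big5 (Traditional Chinese Windows).
# If the guess is wrong, decode() raises UnicodeDecodeError or prints nonsense.
decoded = raw.decode("big5")

print(decoded)
```

The same call can be applied to the whole escaped string to read the full message.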
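Once this error appears, I have to wait until the next day before the script works again, and even then it hits the same problem after running for a few minutes.
However, my browser can still open the site without any trouble, so I assume my IP has not been banned by the site.
I tried this and several other approaches I found online, but none of them worked.
Could this be caused by a problem with my network connection?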
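One way to test whether the failures come from sending requests too quickly, rather than from the network itself, is to retry failed requests with growing pauses. The sketch below is only an illustration under that assumption: `fetch_with_backoff`, `fetch`, `attempts`, `base_delay` and `sleep` are hypothetical names, not part of requests.

```python
import time


def fetch_with_backoff(fetch, url, attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url); on an I/O error, wait and retry with doubling delays."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except IOError:
            # requests' exceptions (including ConnectionError) derive from IOError.
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
```

With the code from the question, `fetch` could be `lambda u: requests.get(u, headers=headers, timeout=30)`. If the error still appears even with long pauses between requests, throttling by the site becomes a less likely explanation.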