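I think I'm missing something basic here about Python process limits. I have a screen scraper that is supposed to log in to a password-protected site once a week, fill out a form to update existing records, and then fetch new ones. (I'm using Django to actually insert the records, if that matters.)
The data I'm scraping builds up over the course of the year. In January the process is relatively quick. By August there are thousands of rows to update, in addition to any new records being added.
It has worked like a dream this year, but recently it started hitting a connection error with this traceback: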
Traceback (most recent call last):
  File "douglasdivorces.py", line 42, in <module>
    forms = [f for f in br.forms()]
  File "/usr/local/lib/python2.6/dist-packages/mechanize-0.2.4.py2.6.egg/mechanize/_mechanize.py", line 420, in forms
    return self._factory.forms()
  File "/usr/local/lib/python2.6/dist-packages/mechanize-0.2.4-py2.6.egg/mechanize/_html.py", line 557, in forms
    self._forms_factory.forms())
  File "/usr/local/lib/python2.6/dist-packages/mechanize-0.2.4-py2.6.egg/mechanize/_html.py", line 237, in forms
    _urlunparse=_rfc3986.urlunsplit,
  File "/usr/local/lib/python2.6/dist-packages/mechanize-0.2.4-py2.6.egg/mechanize/_form.py", line 844, in ParseResponseEx
    _urlunparse=_urlunparse,
  File "/usr/local/lib/python2.6/dist-packages/mechanize-0.2.4-py2.6.egg/mechanize/_form.py", line 979, in _ParseFileEx
    data = file.read(CHUNK)
  File "/usr/local/lib/python2.6/dist-packages/mechanize-0.2.4-py2.6.egg/mechanize/_response.py", line 195, in read
    data = self.wrapped.read(to_read)
  File "/usr/lib/python2.6/socket.py", line 353, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.6/httplib.py", line 518, in read
    return self._read_chunked(amt)
  File "/usr/lib/python2.6/httplib.py", line 551, in _read_chunked
    line = self.fp.readline()
  File "/usr/lib/python2.6/socket.py", line 397, in readline
    data = recv(1)
  File "/usr/lib/python2.6/ssl.py", line 96, in <lambda>
    self.recv = lambda buflen=1024, flags=0: SSLSocket.recv(self, buflen, flags)
  File "/usr/lib/python2.6/ssl.py", line 217, in recv
    return self.read(buflen)
  File "/usr/lib/python2.6/ssl.py", line 136, in read
    return self._sslobj.read(len)
socket.error: [Errno 104] Connection reset by peer
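Is it possible to work around this error and keep my loop in place until the problem is resolved? Or should I take another approach?
Again, my hope is that I'm missing something kindergarten-level, so I'll spare you the pain of reading my code. If it's not that simple, say the word and I'll edit the question to include the script.
Thanks very much! I'm very curious to hear what is causing my script to choke.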
Answer 0 (score: 0)
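socket.error: [Errno 104] Connection reset by peer
In other words, the remote server closed the connection abruptly, here while the chunked response was still being read over SSL. The server does not like your request. Perhaps they changed something on their side.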
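If the resets turn out to be transient, one way to keep the loop in place is to retry the failing step a few times with a growing pause. The following is only a minimal sketch under that assumption: `retry_on_reset` and its parameters are hypothetical names made up for illustration, not part of mechanize or the standard library, and it will not help if the server is deliberately rejecting the request.

```python
import errno
import socket
import time


def retry_on_reset(func, attempts=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call func(), retrying when the peer resets the connection.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except socket.error as exc:
            # Retry only on ECONNRESET (errno 104 on Linux). Any other socket
            # error, or a reset on the final attempt, is re-raised unchanged.
            if exc.errno != errno.ECONNRESET or attempt == attempts:
                raise
            sleep(delay)      # pause before trying again
            delay *= backoff  # wait longer after each failure
```

Because the reset happens while the response is being read, the wrapped function should probably redo the whole request rather than only the parsing step, for example a small function that calls `br.open(url)` again and then returns `list(br.forms())`, where `url` stands for whatever page the script was loading.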