I'm getting this error and I don't understand why. The list index is reported as out of range, yet the item exists.
1 [u'http://(ip1):(port1)', u'http://(ip2):(port2)']
Exception in thread Thread-11:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "(path).py", line 57, in match_fetcher
    self.fetch_match(match)
  File "(path).py", line 65, in fetch_match
    response = self.http_get(url)
  File "(path).py", line 75, in http_get
    proxy = self.proxies.get_proxy()
  File "(path).py", line 51, in get_proxy
    proxy = self.proxies[self.index]
IndexError: list index out of range
Code:
def get_proxy(self):
    if self.index >= len(self.proxies):
        self.index = 0
    print self.index, self.proxies
    proxy = self.proxies[self.index]
    self.index += 1
    return proxy
I'm confused. What is the problem here?
Edit:
You are using threads; could another one be manipulating the same data? – Thierry Lathuille
Output of cat log | grep (proxy1):
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
(...)
Answer 0 (score: 2)
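It looks like you have a classic thread synchronization problem: multiple threads are accessing shared state (self.proxies and self.index) without any coordination. After one thread has passed the if self.index >= len(self.proxies) check, another thread can increment self.index before the first thread reads from the list. The first thread then indexes with a value past the end of the list (index 2 in a two-item list). This is also why the print shows a valid index right before the failure: the value changed between the print and the lookup.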
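To make the interleaving concrete, here is a minimal sketch that replays it by hand in a single thread, so the failure is deterministic. The ProxyPool class and the proxy URLs are hypothetical stand-ins for the code in the question, and the "thread A" / "thread B" steps are simulated rather than run on real threads.

```python
class ProxyPool(object):
    """Hypothetical stand-in for the object that owns proxies and index."""

    def __init__(self, proxies):
        self.proxies = proxies
        self.index = 0


# Hypothetical proxy addresses, used only for illustration.
proxies = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]
pool = ProxyPool(proxies)
pool.index = 1  # state just before the problematic calls

# Thread A runs the bounds check: 1 >= 2 is False, so no reset.
a_needs_reset = pool.index >= len(pool.proxies)

# Thread B runs the same check before A goes any further: also False.
b_needs_reset = pool.index >= len(pool.proxies)

# Thread A reads its proxy and increments the shared index to 2.
a_proxy = pool.proxies[pool.index]
pool.index += 1

# Thread B now indexes with the value A left behind, although its own
# check (and its print) saw index 1.
error = None
try:
    b_proxy = pool.proxies[pool.index]
except IndexError as exc:
    error = str(exc)

print(error)
```

With real threads the same sequence only happens occasionally, which is why the log looks healthy most of the time.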
You need some synchronization mechanism to "protect" your shared resource. A very simple one is a lock:
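import threading

# in your __init__ method
self.lock = threading.Lock()

def get_proxy(self):
    # "with" acquires the lock (blocking if another thread holds it)
    # and releases it even if an exception is raised
    with self.lock:
        # the entire method body is the critical section in your case
        if self.index >= len(self.proxies):
            self.index = 0
        proxy = self.proxies[self.index]
        self.index += 1
        return proxy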
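For reference, here is a self-contained sketch of the locking approach that can be run as-is. The class name ProxyPool, the helper count_proxies_under_load, the thread and call counts, and the proxy URLs are all assumptions made for illustration; only the body of get_proxy mirrors the code in the question.

```python
import threading
from collections import Counter


class ProxyPool(object):
    """Round-robin proxy pool guarded by a lock (hypothetical class name)."""

    def __init__(self, proxies):
        self.proxies = list(proxies)
        self.index = 0
        self.lock = threading.Lock()

    def get_proxy(self):
        # Check, read, and increment happen as one unit under the lock.
        with self.lock:
            if self.index >= len(self.proxies):
                self.index = 0
            proxy = self.proxies[self.index]
            self.index += 1
            return proxy


def count_proxies_under_load(pool, n_threads, calls_per_thread):
    """Call get_proxy from many threads; return (Counter, list of errors)."""
    per_thread = [[] for _ in range(n_threads)]  # one result list per thread
    errors = []

    def worker(slot):
        try:
            for _ in range(calls_per_thread):
                per_thread[slot].append(pool.get_proxy())
        except IndexError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    counts = Counter(p for results in per_thread for p in results)
    return counts, errors


if __name__ == "__main__":
    # Hypothetical proxy addresses, used only for illustration.
    pool = ProxyPool(["http://proxy-a.example:8080",
                      "http://proxy-b.example:8080"])
    counts, errors = count_proxies_under_load(pool, 20, 1000)
    print(counts)
    print(errors)
```

Because every call runs the whole check-read-increment sequence while holding the lock, the proxies are handed out in strict alternation, so with 20 threads making 1000 calls each, both proxies are returned exactly 10000 times and no IndexError is recorded.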
I suggest you read more about threading and synchronization. Here is a nice tutorial: https://hackernoon.com/synchronization-primitives-in-python-564f89fee732