List index out of range, but the item exists

Time: 2018-09-06 07:14:46

Tags: python list

I'm getting this error and I don't understand why: the list index is reported as out of range, but the item exists.

1 [u'http://(ip1):(port1)', u'http://(ip2):(port2)']
Exception in thread Thread-11:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "(path).py", line 57, in match_fetcher
    self.fetch_match(match)
  File "(path).py", line 65, in fetch_match
    response = self.http_get(url)
  File "(path).py", line 75, in http_get
    proxy = self.proxies.get_proxy()
  File "(path).py", line 51, in get_proxy
    proxy = self.proxies[self.index]
IndexError: list index out of range

Code:

def get_proxy(self):

    if self.index >= len(self.proxies):
        self.index = 0
    print self.index, self.proxies
    proxy = self.proxies[self.index]
    self.index += 1
    return proxy

I'm confused. What is the problem here?

Edit:

You are using threads; is another one manipulating the same data? – Thierry Lathuille

Output of cat log | grep (proxy1):

0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
1 [u'(proxy1):(port1)', u'(proxy2):(port2)']
0 [u'(proxy1):(port1)', u'(proxy2):(port2)']
(...)

1 answer:

Answer 0: (score: 2)

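This looks like a classic thread synchronization problem: twenty threads are accessing shared state (self.proxies and self.index) with no coordination. After one thread passes the if self.index >= ... check, other threads can increment self.index before the first thread reads the list, which pushes the index past the end of the list (index >= 2 for your two-element list).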
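To make that interleaving concrete, here is a minimal single-threaded sketch that replays one unlucky schedule by hand. The names ProxyPool, check, use and replay_bad_schedule are hypothetical: they split the question's get_proxy into its "check" half and its "use" half so that the order can be forced deterministically. It is an illustration under those assumptions, not the asker's actual class.

```python
class ProxyPool(object):
    """Hypothetical stand-in for the asker's class, with get_proxy split in two."""

    def __init__(self, proxies):
        self.proxies = proxies
        self.index = 0

    def check(self):
        # First half of get_proxy: wrap the index if it ran off the end.
        if self.index >= len(self.proxies):
            self.index = 0

    def use(self):
        # Second half of get_proxy: read the list, then advance the index.
        proxy = self.proxies[self.index]
        self.index += 1
        return proxy


def replay_bad_schedule():
    """Replay one interleaving of two 'threads' A and B; True if it fails."""
    pool = ProxyPool(["proxy1", "proxy2"])
    pool.index = 1      # state left behind by an earlier call

    pool.check()        # A checks: 1 < 2, nothing to wrap
    pool.check()        # B checks: also fine
    pool.use()          # B reads proxies[1]; index becomes 2
    try:
        pool.use()      # A reads proxies[2], which does not exist
    except IndexError:
        return True     # same error as in the traceback
    return False
```

This would also be consistent with the log line printed just before the traceback: the print runs between the check and the list access, so it can still show index 1 even though the index is 2 by the time the list is read.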

You need some synchronization mechanism to protect your shared resources. One of the simplest is a lock:

import threading

# in your __init__ method
self.lock = threading.Lock()

def get_proxy(self):
    self.lock.acquire()  # blocks if the lock is already held
    try:
        # ... access the shared resource here
        # (in your case, basically the entire method body)
        pass
    finally:
        self.lock.release()  # always released, even if an exception is raised
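For reference, a self-contained sketch of how the locked version might look when written out in full. The class name ProxyPool and the helper run_workers are assumptions made for illustration (the question does not show the class or how the threads are started), and the print call from the question is left out. It uses "with self.lock:", which acquires the lock on entry and releases it on exit, so it behaves like an acquire/release pair that cannot be skipped by an exception.

```python
import threading


class ProxyPool(object):
    """Hypothetical round-robin proxy holder that is safe to share across threads."""

    def __init__(self, proxies):
        self.proxies = list(proxies)
        self.index = 0
        self.lock = threading.Lock()

    def get_proxy(self):
        # Check, read and increment all happen while holding the lock,
        # so no other thread can change self.index in between.
        with self.lock:
            if self.index >= len(self.proxies):
                self.index = 0
            proxy = self.proxies[self.index]
            self.index += 1
            return proxy


def run_workers(pool, n_threads=20, calls_per_thread=1000):
    """Call pool.get_proxy() from several threads and collect every result."""
    results = []
    results_lock = threading.Lock()

    def worker():
        local = [pool.get_proxy() for _ in range(calls_per_thread)]
        with results_lock:
            results.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_workers(ProxyPool(["http://proxy1", "http://proxy2"])) should complete without an IndexError, because each call sees a consistent index for the whole duration of the method.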

I recommend reading more about threads and synchronization; here is a good tutorial: https://hackernoon.com/synchronization-primitives-in-python-564f89fee732