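I am implementing an image downloader using the producer-consumer model. One thread generates (url, filename) pairs and puts them into a queue. I want MAX_THREADS threads to pick up pairs and start downloading. Here are my threads: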
class Extractor(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None, verbose=None, items=None):
        super(Extractor, self).__init__()
        self.target = target
        self.name = name
        self.items = items

    def run(self):
        while True:
            for item in self.items:
                if not QUEUE.full():
                    QUEUE.put_nowait(extract(item))
                    logging.debug('Putting ' + str(item) + ' : ' + str(QUEUE.qsize()) + ' items in queue')

class Downloader(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None, verbose=None):
        super(Downloader, self).__init__()
        self.target = target
        self.name = name
        self.seen = set()

    def run(self):
        while True:
            if not QUEUE.empty():
                pair = QUEUE.get_nowait()
                # I have seen the URL
                if pair[0] in self.seen:
                    continue
                else:
                    # Never seen it before
                    self.seen.add(pair[0])
                    logging.debug('Downloading ' + str(pair[1]) + ' : ' + str(QUEUE.qsize()) + ' items in queue')
                    download_one_pic(pair)

if __name__ == '__main__':
    items = None
    items = crawl('username__', items)
    worker_threads = []
    producer = Extractor(name='Extractor', items=items)
    producer.daemon = True
    producer.start()
    consumer = Downloader(name='Downloader[1]')
    consumer2 = Downloader(name='Downloader[2]')
    worker_threads.append(consumer)
    worker_threads.append(consumer2)
    for thread in worker_threads:
        thread.start()
        thread.join()

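The queue has a maximum size of 50, and I want the producer thread to run regardless of what the other threads do, so I made it a daemon. One thing is strange: the consumer2 thread never starts, and I don't know why. In my logs only Downloader[1] does any work, and the queue size fluctuates between 49 and 50, so I know Downloader[2] never starts.
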
Answer 0 (score: 0)
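Calling join() on a thread blocks until that thread finishes. The loop at the end of your code therefore runs only one iteration: it starts Downloader[1], then waits in join() for it to finish, which never happens because Downloader.run loops forever, so Downloader[2] is never started. Call start() on every thread in one loop, then loop over the threads a second time to join() them, so that you only begin waiting after all threads have started.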
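
A minimal sketch of that two-loop pattern, under some assumptions that are not in the original code: the workers here are plain functions rather than the Downloader class, they exit when they receive a STOP sentinel (the original loops forever), and work_queue, NUM_WORKERS, and the file names are hypothetical stand-ins. The barrier is there only to show that both workers are alive at the same time; with join() inside the start loop, the first worker would never get past it.

```python
import queue
import threading

NUM_WORKERS = 2                        # stands in for MAX_THREADS
STOP = object()                        # hypothetical sentinel telling a worker to exit
work_queue = queue.Queue(maxsize=50)   # stands in for the question's QUEUE
all_running = threading.Barrier(NUM_WORKERS)
handled = []                           # (worker name, item) pairs
handled_lock = threading.Lock()


def worker():
    # The barrier is only passed once every worker has been started.
    all_running.wait(timeout=5)
    name = threading.current_thread().name
    while True:
        item = work_queue.get()        # blocks until an item is available
        if item is STOP:
            break
        with handled_lock:
            handled.append((name, item))


def run_workers(items):
    threads = [threading.Thread(target=worker, name='Downloader[%d]' % (i + 1))
               for i in range(NUM_WORKERS)]
    # First loop: start every thread.
    for thread in threads:
        thread.start()
    # Feed the work, then one sentinel per worker so each one exits.
    for item in items:
        work_queue.put(item)
    for _ in threads:
        work_queue.put(STOP)
    # Second loop: only now wait for the threads to finish.
    for thread in threads:
        thread.join()
    return threads


threads = run_workers(['a.jpg', 'b.jpg', 'c.jpg'])
```

In the original program the equivalent change would be to move thread.join() out of the for loop that calls thread.start() and into a second loop over worker_threads.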