Queue does not process all elements when there are many threads

Date: 2016-08-25 18:47:45

Tags: python multithreading python-2.7

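I have noticed that when many threads pull elements from a queue, the number of elements counted as processed is lower than the number I put into the queue. It is sporadic, but it seems to happen about half of the time I run the following code.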

#!/bin/env python

from threading import Thread
import httplib, sys
from Queue import Queue
import time
import random

concurrent = 500
num_jobs = 500

results = {}

def doWork():
    while True:
        result = None
        try:
            result = curl(q.get())
        except Exception as e:
            print "Error when trying to get from queue: {0}".format(str(e))

        if results.has_key(result):
            results[result] += 1
        else:
            results[result] = 1

        try:
            q.task_done()
        except:
            print "Called task_done when all tasks were done"

def curl(ourl):
    result = 'all good'
    try:
        time.sleep(random.random() * 2)
    except Exception as e:
        result = "error: %s" % str(e)
    except:
        result = str(sys.exc_info()[0])
    finally: 
        return result or "None"

print "\nRunning {0} jobs on {1} threads...".format(num_jobs, concurrent)

q = Queue()

for i in range(concurrent):
    t = Thread(target=doWork)
    t.daemon = True
    t.start()

for x in range(num_jobs):
    q.put("something")

try:
    q.join()
except KeyboardInterrupt:
    sys.exit(1)

total_responses = 0
for result in results:
    num_responses = results[result]
    print "{0}: {1} time(s)".format(result, num_responses)
    total_responses += num_responses

print "Number of elements processed: {0}".format(total_responses)

1 Answer:

Answer 0 (score: 1)

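Tim Peters hit the nail on the head in the comments. The problem is that the results are tallied from inside the worker threads, and the shared results dict is not protected by any kind of mutex. The check for the key and the update that follows are separate steps, so a thread can be suspended between them. That allows something like this to happen: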

thread A gets result: "all good"
thread A checks results[result]
thread A sees no such key
thread A suspends  # <-- before counting its result
thread B gets result: "all good"
thread B checks results[result]
thread B sees no such key
thread B sets results['all good'] = 1
thread C ...
thread C sets results['all good'] = 2
thread D ...
thread A resumes  # <-- and remembers it needs to count its result still
thread A sets results['all good'] = 1  # resetting previous work!
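
One way to close that window is to guard the check-and-update with a lock. The following is only a minimal sketch, not the asker's exact program: the function name `count_with_lock` and the name `results_lock` are made up for illustration, the real work is replaced by a constant result, and it is written for Python 3 (on Python 2.7 the module is `Queue` rather than `queue`).

```python
import threading
from queue import Queue  # on Python 2.7: from Queue import Queue


def count_with_lock(num_jobs=200, num_threads=50):
    """Tally worker results in a shared dict, guarded by a lock."""
    q = Queue()
    results = {}
    results_lock = threading.Lock()  # hypothetical name, protects `results`

    def worker():
        while True:
            item = q.get()
            result = "all good"  # stand-in for the real work, e.g. curl(item)
            # Only one thread at a time may read and update the tally,
            # so no increment can be overwritten by a suspended thread.
            with results_lock:
                results[result] = results.get(result, 0) + 1
            q.task_done()

    for _ in range(num_threads):
        t = threading.Thread(target=worker)
        t.daemon = True  # let the program exit while workers block on get()
        t.start()

    for _ in range(num_jobs):
        q.put("something")

    q.join()  # returns only after every task_done(), i.e. after every update
    return results
```

Calling `count_with_lock(500, 500)` should then report all 500 jobs every time, because `task_done()` is only called after the locked update has finished.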

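A more typical workflow would have a results queue that the main thread listens on, so that only one thread ever touches the results dict.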

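import queue  # this module is named Queue on Python 2

workq = queue.Queue()
resultsq = queue.Queue()

# make_work and do_work are placeholders for your own producer and workers
make_work(into=workq)
do_work(source=workq, respond_on=resultsq)
# do_work would do respond_on.put_nowait(result) instead of
#   return result

results = {}

while True:
    try:
        # the timeout (5 seconds is an arbitrary choice) lets get() raise
        # queue.Empty; without it get() would block forever
        result = resultsq.get(timeout=5)
    except queue.Empty:
        break  # maybe? You'd probably want to retry a few times
    results[result] = results.get(result, 0) + 1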
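
For reference, here is a self-contained sketch of that results-queue pattern. It makes some assumptions that are not in the original: the function name `count_with_results_queue` is invented, the real work is replaced by a constant result, every job is assumed to produce exactly one result (so the main thread can simply read `num_jobs` of them), and it targets Python 3.

```python
import threading
from queue import Queue  # on Python 2.7: from Queue import Queue


def count_with_results_queue(num_jobs=200, num_threads=50):
    """Workers report on a results queue; only the caller tallies."""
    workq = Queue()
    resultsq = Queue()

    def do_work():
        while True:
            item = workq.get()
            result = "all good"  # stand-in for the real work, e.g. curl(item)
            resultsq.put(result)  # hand the result over, do not count it here
            workq.task_done()

    for _ in range(num_threads):
        t = threading.Thread(target=do_work)
        t.daemon = True  # let the program exit while workers block on get()
        t.start()

    for _ in range(num_jobs):
        workq.put("something")

    # Assumes one result per job, so a blocking get() is safe here.
    results = {}
    for _ in range(num_jobs):
        result = resultsq.get()
        results[result] = results.get(result, 0) + 1
    return results
```

Because the dict is only ever modified by the thread that calls the function, no lock is needed; `Queue` does its own locking internally. If a worker could fail without reporting anything, the blocking `get()` would need a timeout, as in the loop above.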