Question

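I'm fairly new to Python, and I've been working on a script that parses the CSV files in any given directory. After I implemented a queue and threads, I got stuck on a problem: the threads do not pick up new work even though there are still items in the queue. For example, if I set the maximum number of threads to 3 and there are 6 items in the queue, the threads pick up 3 files, process them, and then hang indefinitely. I may simply be misunderstanding the multithreading process conceptually.

ETA: Some of the code has been removed for security reasons.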

q = Queue.Queue()
threads = []

for file in os.listdir(os.chdir(arguments.path)):
    if file.endswith('.csv'):
        q.put(file)
for i in range(max_threads):
    worker = threading.Thread(target=process, name='worker-{}'.format(thread_count))
    worker.setDaemon(True)
    worker.start()
    threads.append(worker)
    thread_count += 1
q.join()

def process():
    with open(q.get()) as csvfile:
        #do stuff
        q.task_done()

Answer 1

You forgot to loop over the queue in your threads ...

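def process():
    while True:  # <---------------- keep getting stuff from the queue
        filename = q.get()  # blocks until an item is available
        try:
            with open(filename) as csvfile:
                pass  # do stuff
        finally:
            q.task_done()  # always mark the item done so q.join() can return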
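To see why the loop matters: in the question's version each thread calls `q.get()` exactly once and then exits, so with 3 threads and 6 items `task_done()` is only called 3 times and `q.join()` waits forever for the rest. Below is a minimal, self-contained sketch of the looping worker, assuming Python 3 (where the module is named `queue`). The names `run_workers` and `worker` and the `item.upper()` stand-in for the real CSV work are made up for illustration and are not from the original post.

```python
import queue
import threading


def run_workers(items, max_threads=3):
    """Process every item with a fixed number of worker threads."""
    q = queue.Queue()
    processed = []
    lock = threading.Lock()

    def worker():
        while True:  # loop so each thread takes more than one item
            item = q.get()  # blocks until an item is available
            try:
                with lock:
                    # hypothetical stand-in for the real CSV processing
                    processed.append(item.upper())
            finally:
                q.task_done()  # one call per item fetched with q.get()

    for item in items:
        q.put(item)
    for i in range(max_threads):
        # daemon threads are abandoned when the main thread exits
        threading.Thread(target=worker, name='worker-{}'.format(i),
                         daemon=True).start()
    q.join()  # returns once task_done() has been called for every item
    return sorted(processed)


if __name__ == "__main__":
    print(run_workers(['a.csv', 'b.csv', 'c.csv', 'd.csv', 'e.csv', 'f.csv']))
```

Because the workers are daemon threads stuck in `q.get()` after the queue drains, they simply die with the main thread; a script that keeps running afterwards would probably want an explicit shutdown signal instead.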

That said, you are probably reinventing the wheel; try using a thread pool:

import os
from concurrent.futures import ThreadPoolExecutor

l = []  # a list should do it ...
for file in os.listdir(arguments.path):
    if file.endswith('.csv'):
        l.append(file)

def process(file):
    # do the real work for one file here and return its result
    return "this is the file i got %s" % file

with ThreadPoolExecutor(max_workers=4) as e:
    results = list(e.map(process, l))  # one result per file, in input order
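For a fuller picture, here is a sketch of the same thread-pool approach doing some actual CSV work. It assumes Python 3; `count_rows` and `parse_directory` are hypothetical helper names, and counting rows is only a placeholder for whatever the real script does with each file. Note that the file names from `os.listdir` are joined with the directory, so the script does not need to `chdir` first.

```python
import csv
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def count_rows(path):
    """Return (file name, number of CSV rows) for one file."""
    with open(path, newline='') as csvfile:
        return os.path.basename(path), sum(1 for _ in csv.reader(csvfile))


def parse_directory(directory, max_workers=3):
    """Run count_rows over every .csv file in directory using a thread pool."""
    paths = sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith('.csv')
    )
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() hands out work as threads become free and
        # yields the results in the same order as the inputs
        return list(executor.map(count_rows, paths))


if __name__ == "__main__":
    # build a throwaway directory with six small CSV files to parse
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(6):
            with open(os.path.join(tmp, 'file{}.csv'.format(i)), 'w', newline='') as f:
                csv.writer(f).writerows([['id', 'value']] + [[n, n * n] for n in range(i)])
        print(parse_directory(tmp))
```

The pool keeps handing files to idle threads until the list is exhausted, so there is no queue, `task_done()`, or `join()` bookkeeping to get wrong.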

Threads not getting more work from Queue
