Question

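Right now I have a for loop that iterates over a list, which is usually 100-500 items long. Inside the loop, a new thread is started for each item. So my code currently looks like this: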

    threads = []
    for item in items:
        t = threading.Thread(target=myfunction, args=(item,))
        threads.append(t)
        t.start()

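But I don't want to start a new thread for every item, since each thread takes only a few seconds at most to execute myfunction. I still want to run my loop and call my function with each item as its argument, but a thread should close once it is done and allow another one to take over. The maximum number of threads I want open is no fewer than 3 and no more than 20, although that range can vary if it makes things easier. I just don't want to open a new thread for every item in the loop.

For those who are curious, in case it matters: myfunction is a function I defined that uses urllib to send a POST request to a site.

I'm new to Python, but not new to programming in general. Sorry if this is a basic question.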

Answer 1

I think what you are looking for is a thread pool.

The answers to this question detail some possible solutions.

One of the simplest (assuming Python 3, or the backport from PyPI) is:

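    from concurrent.futures import ThreadPoolExecutor

    # The pool never runs more than max_workers threads at once.
    executor = ThreadPoolExecutor(max_workers=10)
    futures = []
    for item in items:
        # submit() queues the call and returns a Future immediately.
        future = executor.submit(myfunction, item)
        futures.append(future)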

This executes myfunction for every item using 10 threads. You can later use the list of futures to wait for the calls to complete.
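As a rough sketch of how the list of futures might be used to wait for completion: the version below collects each result as its call finishes. Here `myfunction` is a hypothetical stand-in that sleeps briefly instead of sending a POST request, and `run_all` and its `max_workers` default are names made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def myfunction(item):
    # Hypothetical stand-in: the real function would send a POST request.
    time.sleep(0.01)
    return item * 2


def run_all(items, max_workers=10):
    results = {}
    # Leaving the with-block waits for all submitted calls to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the item it was submitted with.
        futures = {executor.submit(myfunction, item): item for item in items}
        for future in as_completed(futures):
            # result() re-raises any exception raised inside myfunction.
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    print(run_all(range(5)))
```

Using the executor as a context manager is one way to make sure no worker thread is still running when the loop's results are read; calling `future.result()` on each entry of a plain list would work as well.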

Answer 2

I believe your problem lies in a missing function. It could be one of several issues, so I recommend visiting Python's homepage: https://goo.gl/iAZuNX

1 - 1 / b

Answer 3

Modify your code slightly to check the number of active threads at any given time:

    import threading

    threads = []
    consumed_by_threads = 0
    consumed_by_main = 0
    for item in items:
        # active_count() includes the main thread.
        at = threading.active_count()
        if at <= 20:
            # Below the limit: hand the item to a new thread.
            t = threading.Thread(target=myfunction, args=(item,))
            threads.append(t)
            consumed_by_threads += 1
            t.start()
        else:
            # At the limit: process the item in the main thread instead.
            print("active threads:", at)
            consumed_by_main += 1
            myfunction(item)

    print("consumed_by_threads:", consumed_by_threads)
    print("consumed_by_main:", consumed_by_main)

    # here the rest of your code, thread join, etc

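Note: I am only checking the maximum number of threads. By the way, the limit should be 21, because the main thread is included in the count (see here and follow the link to enumerate).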
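To illustrate the point about the main thread being counted, here is a small sketch. The helper names `worker_count` and `demo` are made up for this example, and it assumes no other code is starting or stopping threads while it runs.

```python
import threading


def worker_count():
    # active_count() counts every live Thread object, including the
    # main thread, so subtract one to get the number of other threads.
    return threading.active_count() - 1


def demo():
    release = threading.Event()
    before = worker_count()
    # The thread blocks on the event, so it stays alive until released.
    t = threading.Thread(target=release.wait)
    t.start()
    during = worker_count()
    release.set()
    t.join()
    after = worker_count()
    return before, during, after


if __name__ == "__main__":
    print(demo())
```

Starting one extra thread raises the count by one, and joining it brings the count back down, while the main thread stays in the total throughout.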

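Nota bene: as usual, double-check whether multithreading actually benefits your specific application, depending on which Python implementation you use and whether the threads are CPU-bound or I/O-bound.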

Threading in a Python loop, but with a maximum number of threads
