Executing at most five threads at a time

Time: 2014-07-03 09:17:15

Tags: python multithreading python-2.7

I have a list of X threads (possibly more than 100), and I want no more than five of them running at the same time.

I came up with this:

import os
from os import listdir
from os.path import isfile, join
import shutil
import Image
import math
import threading

CAMERA_NUMBER = 21 # there are 21 cameras, numbered 1 to 21
ORDERED_SCAN_OUTPUT_FOLDER = "scanData"
PRETTY_PRINT_OUTPUT_FOLDER = "preview"
ROTATION_ANGLE = 90
RATIO = 0.4
IMAGE_PER_ROW = 7
MAX_THREAD = 5

def getNumberOfScanToProcess(absolute_folder):
    folder_list = get_all_folders_from(absolute_folder)
    return len(folder_list)

def runThreadListBlockByBlock(thread_list, number_of_simultaneous_thread):
    """ you have a thread list and you only want to run them 5 by 5, use this """
    print ""
    print "launching thread list by run of " + str(number_of_simultaneous_thread) + " Thread(s)"
    thread_counter = 0
    initial_count = 0
    for thread_id in range(0, len(thread_list)):
        print "launching thread " + str(thread_id)
        thread_list[thread_id].start()
        thread_counter = thread_counter+1
        if initial_count+number_of_simultaneous_thread == thread_counter:
            # a full block has been started: wait for all of it before continuing
            for thread_number in range(initial_count, thread_counter):
                print "waiting for thread " + str(thread_number)
                thread_list[thread_number].join()
            initial_count = thread_counter
    # wait for the last, possibly incomplete, block
    for thread_number in range(initial_count, thread_counter):
        print "waiting for thread " + str(thread_number)
        thread_list[thread_number].join()

class prettyPrintThread(threading.Thread):
    def __init__(self, folder_to_process, ratio, rotation_angle, image_per_row, output_folder, thread_id):
        super(prettyPrintThread, self).__init__()
        self.ratio = ratio
        self.rotation_angle = rotation_angle
        self.image_per_row = image_per_row
        self.output_folder = output_folder
        self.thread_id = thread_id
        self.folder_to_process = folder_to_process

    def run(self):
        pretty_print(self.folder_to_process, self.ratio, self.rotation_angle, self.image_per_row, self.output_folder, self.thread_id)

script_absolute_folder = os.path.abspath(os.getcwd())
stored_scan_absolute_folder = join(script_absolute_folder, ORDERED_SCAN_OUTPUT_FOLDER)
scan_count = getNumberOfScanToProcess(stored_scan_absolute_folder)

thread_list = []
#Making the thread list
for thread_number in range(0, scan_count):
    print "preparing thread number " + str(thread_number)
    thread_list.append(prettyPrintThread(join(ORDERED_SCAN_OUTPUT_FOLDER, str(thread_number).zfill(4)), RATIO, ROTATION_ANGLE, IMAGE_PER_ROW, PRETTY_PRINT_OUTPUT_FOLDER, 1))
# launch 5 threads, wait for them to finish, then launch the next 5, and so on
runThreadListBlockByBlock(thread_list, MAX_THREAD)

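The problem is that this waits for all five threads of a block to finish, whereas I could start another thread as soon as any one of them is done.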

Is there some way, like an event/listener in Java, to be notified when a thread finishes?

Thanks

2 answers:

Answer 0: (score: 3)

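The best way to do this is probably to use a Semaphore object. Create a semaphore with an initial value of 5, then have your main thread (the one that controls the others) call the semaphore's acquire() method before starting each thread, presumably in a loop. Once five threads are running, the next acquire() call blocks.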

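Each thread should call the semaphore's release() method when it finishes. That wakes the main thread by letting its pending acquire() call return, so it starts another thread, and so on until there is nothing left to do.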

Once you have finished starting threads, take care to join() the last ones so that they terminate before the main thread exits.

A BoundedSemaphore additionally lets you detect the error where threads release more times than they acquire.
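The approach described above could be sketched roughly as follows. This is only an illustration under assumptions, not the asker's code: run_throttled, worker and demo_job are hypothetical names, and the jobs are plain zero-argument callables rather than the prettyPrintThread objects from the question. The peak counter exists only to show that the limit holds.

```python
import threading
import time

MAX_THREADS = 5  # same limit as MAX_THREAD in the question


def run_throttled(jobs, max_threads=MAX_THREADS):
    """Run each zero-argument callable in its own thread, at most
    max_threads at a time. Returns the highest number of jobs that
    were observed running simultaneously."""
    slots = threading.BoundedSemaphore(max_threads)
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def worker(job):
        try:
            with lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            job()
        finally:
            with lock:
                state["running"] -= 1
            slots.release()  # free the slot even if job() raised

    threads = []
    for job in jobs:
        slots.acquire()  # blocks while max_threads workers hold a slot
        t = threading.Thread(target=worker, args=(job,))
        t.start()
        threads.append(t)

    # join everything so no worker is still running when we return
    for t in threads:
        t.join()
    return state["peak"]


def demo_job():
    # hypothetical stand-in for the real work (pretty_print in the question)
    time.sleep(0.01)


peak = run_throttled([demo_job] * 20)
print("peak concurrency: %d" % peak)
```

Releasing the semaphore in a finally clause means a job that raises still gives its slot back, so the main loop cannot stall on a failed thread.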

Answer 1: (score: 2)

Use ThreadPoolExecutor from the concurrent.futures library (it has been backported to Python 2.7).

Usage looks like this:

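from concurrent.futures import ThreadPoolExecutor

# the pool never runs more than max_workers calls at once
executor = ThreadPoolExecutor(max_workers=5)

# submit() queues each call and returns a Future immediately
futures = [
    executor.submit(callable_which_gets_the_job_done, some_argument=foo)
    for foo in bar
]

# result() blocks until that particular call has finished
for foo, future in zip(bar, futures):
    print "callable_which_gets_the_job_done(some_argument=%s) returned %s" % (
        foo,
        future.result(),
    )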

The executor will run callable_which_gets_the_job_done(some_argument=foo) for each value foo in bar. Each call runs on a worker thread from the pool, and no more than 5 calls run at the same time.
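Since the question also asks for something like a listener that fires when a thread finishes, here is a minimal sketch of two ways a Future can provide that: add_done_callback() and as_completed(). The names process_scan, on_done and finished are hypothetical, and the sleep merely stands in for the real per-folder work.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def process_scan(scan_id):
    # hypothetical stand-in for the real per-folder work
    time.sleep(0.01)
    return scan_id * 2


finished = []


def on_done(future):
    # invoked once this particular job has finished
    finished.append(future.result())


executor = ThreadPoolExecutor(max_workers=5)
futures = []
for scan_id in range(12):
    future = executor.submit(process_scan, scan_id)
    future.add_done_callback(on_done)
    futures.append(future)

# as_completed() yields futures in completion order, not submission order
results = sorted(f.result() for f in as_completed(futures))

# wait for the worker threads to exit before the main thread continues
executor.shutdown(wait=True)
print("collected %d results" % len(results))
```

The callback normally runs on the worker thread that completed the job, so it should stay short and must not assume it is running on the main thread.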