确保线程按创建顺序返回

时间:2017-03-14 02:16:20

标签: python python-3.x numpy multiprocessing

我有一个简单的问题,我希望有人可以帮助我。我有一个函数,通过for循环称为一系列线程。数组输出存储到Queue并在主程序中检索。线程工作很完美,不幸的是,数据数组的返回顺序与调用的顺序不同。我在这个问题中附上一个伪代码示例。是否有一些线程锁定机制允许创建和并行运行线程,但是按照它们的创建顺序返回。

import numpy as np
from multiprocessing import Process, Queue

def Thread_Func(container1,container2,container3,iterable,more_parameters, \
                Thread):
    Array1 = np.array([]); Array2 = np.array([])
    for i in range(len(iterable)):
        Array1 = np.append(some_operation)
        Array2 = np.append(some_other_operation)
    container1.put(Array1)
    container2.put(Array2)
    container3.put(Thread)

if __name__ == "__main__":
    container1 = Queue()
    container2 = Queue()
    container3 = Queue()
    Thread = 1
    for index in range(Number_of_Threads):
        p = Process(target=Thread_Func,args=(container1,container2,container3, \
                    iterable,more_parameters))
        Thread = Thread + 1
        p.daemon = True
        p.start()

    Array1       = np.array([])
    Array2       = np.array([])
    Thread_Array = np.array([])
    for index in range(Number_of_Threads)
        Array1 = np.append(Array1,container1.get())
        Array2 = np.append(Array2,container2.get())
        Thread_Array = np.append(Thread_Array,container3.get())
    p.join()

    print (Array1)
    print (Array2)
    print (Thread_Array)

在查看Array1的打印输出后,很明显线程是无序返回的,但是当我在运行两个线程后查看Thread_Array时,我看到了[2. 1.],什么时候我应该看[1. 2.]。我能做些什么来确保以正确的顺序返回线程吗?

1 个答案:

答案 0 :(得分:0)

多处理.Pool听起来像你需要的。默认设置是创建一个等于系统中处理器数量的池。您可以按提交的顺序提交工作并将未来结果存储在数组中,然后按所需顺序获取结果。

示例:

from multiprocessing import Pool
import time
import random

# Use a simple list of numbers for work
work = list(range(100))

# Run jobs in size of chunk.
chunk = 10

random.seed(time.time())

# This work is so simple we'll sleep to simulate longer work calculations
# and the threads will finish in random order.
def worker(arr):
    time.sleep(random.randint(1,5)/20) # .05 to .25 seconds
    return [x*10 for x in arr]

if __name__ == '__main__':
    pool = Pool()
    results = []

    # Break the work into chunks, and submit jobs to run asynchronously.
    # The callback will print the index of the job as it finishes.
    for i in range(0,len(work),chunk):
        results.append(pool.apply_async(
            worker,
            args=(work[i:i+chunk],),
            callback=lambda x,i=i:print(i)))

    # Fetch the results in the order submitted.
    for result in results:
        print(result.get())

观察线程完成无序。获取第一个结果直到完成,然后按顺序打印结果,直到完成的作业。然后它再次停滞等待50范围完成。

输出:

10
30
20
40
0
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
[100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
[200, 210, 220, 230, 240, 250, 260, 270, 280, 290]
[300, 310, 320, 330, 340, 350, 360, 370, 380, 390]
[400, 410, 420, 430, 440, 450, 460, 470, 480, 490]
80
90
70
60
50
[500, 510, 520, 530, 540, 550, 560, 570, 580, 590]
[600, 610, 620, 630, 640, 650, 660, 670, 680, 690]
[700, 710, 720, 730, 740, 750, 760, 770, 780, 790]
[800, 810, 820, 830, 840, 850, 860, 870, 880, 890]
[900, 910, 920, 930, 940, 950, 960, 970, 980, 990]