Order of the output returned by a multiprocessing starmap pool in Python

Time: 2019-09-27 06:16:08

Tags: python multiprocessing threadpool

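I have four files stored on my drive, all in .npz format. I wrote a multiprocessing Python program based on a starmap pool. It uses 4 processes, each of which loads the data from one of the .npz files in the directory, and the data is returned in 4 lists.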

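Below is a breakdown of the code, which gives me the expected results. My only problem is understanding the order in which the results are returned.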

save_list_in  # A list containing the paths and names of the four .npz files to be loaded

file_name_1   # A substring to be found in the items of save_list_in

Here is the function to be run in parallel.

import numpy as np

def save(file_in, file_name_1, file_name_2):
    # Only the slots whose pattern matches file_in are filled;
    # the rest are returned as empty lists.
    sample_1_seq = []
    sample_1_embed = []
    sample_2_embed = []
    sample_2_seq = []

    if file_name_1 + "_seq" in file_in:
        # Pass allow_pickle directly instead of temporarily patching np.load
        loaded_1 = np.load(file_in, allow_pickle=True)
        print(file_in + " " + str(len(loaded_1.f.arr_0)))
        sample_1_seq = loaded_1.f.arr_0

    if file_name_1 + "_vec" in file_in:
        loaded_2 = np.load(file_in, allow_pickle=True)
        print(file_in + " " + str(len(loaded_2.f.arr_0)))
        sample_1_embed = loaded_2.f.arr_0

    if file_name_2 + "_seq" in file_in:
        loaded_3 = np.load(file_in, allow_pickle=True)
        print(file_in + " " + str(len(loaded_3.f.arr_0)))
        sample_2_seq = loaded_3.f.arr_0

    if file_name_2 + "_vec" in file_in:
        loaded_4 = np.load(file_in, allow_pickle=True)
        print(file_in + " " + str(len(loaded_4.f.arr_0)))
        sample_2_embed = loaded_4.f.arr_0

    return sample_1_seq, sample_1_embed, sample_2_seq, sample_2_embed

I am calling the above function as follows.

import multiprocessing as mp

pool = mp.Pool(4)
sample_1_seq, sample_1_embed, sample_2_seq, sample_2_embed = zip(*pool.starmap(save, [(file_in, file_name_1, file_name_2) for file_in in save_list_in]))
pool.close()
pool.join()
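As a rough illustration of what the call above hands back before `zip` is applied, the sketch below runs a stand-in for `save` through `starmap`. The function `fake_save`, the file names, and the substrings `"a"` and `"b"` are all hypothetical. `multiprocessing.pool.ThreadPool` (a subclass of `Pool` that inherits its `starmap`) is used here only so the snippet runs without a `__main__` guard; it is not what the original program uses.

```python
from multiprocessing.pool import ThreadPool

def fake_save(file_in, file_name_1, file_name_2):
    """Hypothetical stand-in for save(): returns four slots, filled only on a match."""
    slots = [[], [], [], []]
    patterns = [file_name_1 + "_seq", file_name_1 + "_vec",
                file_name_2 + "_seq", file_name_2 + "_vec"]
    for i, pattern in enumerate(patterns):
        if pattern in file_in:
            slots[i] = ["data from " + file_in]
    return tuple(slots)

# Hypothetical inputs; the real save_list_in holds paths to .npz files.
paths = ["a_seq.npz", "a_vec.npz", "b_seq.npz", "b_vec.npz"]
args = [(p, "a", "b") for p in paths]

with ThreadPool(4) as pool:
    # starmap unpacks each tuple into positional arguments: fake_save(p, "a", "b")
    rows = pool.starmap(fake_save, args)

# One return value (a 4-tuple) per input file, so rows has len(paths) entries.
for path, row in zip(paths, rows):
    print(path, [len(slot) for slot in row])
```

Each row carries all four slots, even though a given file fills only the slot it matches.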

Result: sample_1_seq, sample_1_embed, sample_2_seq and sample_2_embed each have length 4. The results I want are in sample_1_embed[1], sample_1_seq[0], sample_2_embed[2] and sample_2_seq[3]. I would like to know why I get the result in sample_1_embed[1] rather than as the single list sample_1_embed:

sample_1_embed[1] instead of sample_1_embed
sample_1_seq[0] instead of sample_1_seq
sample_2_embed[2] instead of sample_2_embed
sample_2_seq[3] instead of sample_2_seq
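One way to see where the length of 4 comes from: `starmap` returns one 4-tuple per file, and `zip(*...)` transposes that list of rows into four columns, each with one entry per file. Below is a minimal sketch with the rows written out by hand and placeholder strings standing in for the loaded arrays. The file order is an assumption chosen to reproduce the indices reported above, and `only_non_empty` is a hypothetical helper, not part of the original code.

```python
# Hand-written stand-in for the list starmap would return, one row per file.
# Each row is (sample_1_seq, sample_1_embed, sample_2_seq, sample_2_embed);
# only the slot matching that file is filled, the rest stay empty lists.
rows = [
    (["s1_seq"], [], [], []),    # file 0: matches file_name_1 + "_seq"
    ([], ["s1_vec"], [], []),    # file 1: matches file_name_1 + "_vec"
    ([], [], [], ["s2_vec"]),    # file 2: matches file_name_2 + "_vec"
    ([], [], ["s2_seq"], []),    # file 3: matches file_name_2 + "_seq"
]

# Transpose rows into columns: each name now holds one entry per file.
sample_1_seq, sample_1_embed, sample_2_seq, sample_2_embed = zip(*rows)

def only_non_empty(column):
    """Return the single non-empty entry of a column (hypothetical helper)."""
    filled = [item for item in column if len(item) > 0]
    if len(filled) != 1:
        raise ValueError("expected exactly one non-empty entry")
    return filled[0]

print(len(sample_1_embed))           # one entry per input file
print(sample_1_embed)                # data sits at the index of the matching file
print(only_non_empty(sample_2_seq))  # collapse a column to its one real result
```

Under this assumed file order, the filled positions come out as sample_1_seq[0], sample_1_embed[1], sample_2_embed[2] and sample_2_seq[3].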

Also, how is the order of these list indices assigned? For example:

sample_1_embed[1] instead of sample_1_embed[0] or sample_1_embed[2] or sample_1_embed[3]
sample_1_seq[0] instead of sample_1_seq[1] or sample_1_seq[2] or sample_1_seq[3]
sample_2_embed[2] instead of sample_2_embed[0] or sample_2_embed[1] or sample_2_embed[3]
sample_2_seq[3] instead of sample_2_seq[0] or sample_2_seq[1] or sample_2_seq[2]
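On the ordering question, `starmap` returns its results in the order of the input iterable, regardless of which task finishes first (`imap_unordered` is the variant without that guarantee). The sketch below tries to make this visible with artificial delays. The function `slow_echo` and the delays are hypothetical, and `ThreadPool` is again used only so the snippet runs without a `__main__` guard.

```python
import time
from multiprocessing.pool import ThreadPool

def slow_echo(index, delay):
    """Sleep for `delay` seconds, then report which task this was."""
    time.sleep(delay)
    return index

# Task 0 sleeps longest, so it is expected to finish last.
tasks = [(0, 0.3), (1, 0.2), (2, 0.1), (3, 0.0)]

with ThreadPool(4) as pool:
    # Results are collected by input position, not by completion time.
    ordered = pool.starmap(slow_echo, tasks)

print(ordered)  # [0, 1, 2, 3]
```

Applied to the question, index i of each result column would correspond to save_list_in[i], so which index holds the data depends on where the matching file sits in save_list_in.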

0 answers:

No answers