Parallel processing with concurrent.futures

Time: 2020-01-29 16:24:20

Tags: python pandas parallel-processing

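I am trying to parallelize the processing of a DataFrame using the approach shown in this tutorial: https://www.youtube.com/watch?v=fKl2JW_qrso (from minute 18:26), but the result shows that I am doing something wrong. The idea of the code is to create a new column, 'denominator', in the DataFrame, where each row holds the sum of that row's 'basalareap', 'basalareas', and 'basalaread' columns. Can anyone explain why I get this strange result when printing? Also, is there a more efficient way to parallelize this?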

import pandas as pd
import numpy as np
import concurrent.futures
from multiprocessing import cpu_count

np.random.seed(4)
layer = pd.DataFrame(np.random.randint(0,25,size=(10, 3)),
                  columns=list(['basalareap', 'basalareas', 'basalaread']))

def denom():
    layer['denominator'] = layer[["basalareap","basalareas","basalaread"]].sum(axis=1)

data_split = np.array_split(layer,cpu_count())


with concurrent.futures.ProcessPoolExecutor() as executor:
    results = [executor.submit(denom) for i in data_split]
print(results)

>>>print(results)
[<Future at 0x1b45e325108 state=finished raised BrokenProcessPool>, 
<Future at 0x1b45e357708 state=finished raised BrokenProcessPool>, 
<Future at 0x1b45e3577c8 state=finished raised BrokenProcessPool>, 
<Future at 0x1b45e357888 state=finished raised BrokenProcessPool>, 
<Future at 0x1b45e357948 state=finished raised BrokenProcessPool>, 
<Future at 0x1b45e357a48 state=finished raised BrokenProcessPool>, 
<Future at 0x1b45e357b08 state=finished raised BrokenProcessPool>, 
<Future at 0x1b45e357bc8 state=finished raised BrokenProcessPool>]

My system: Windows 10, Python 3.7.4

1 answer:

Answer 0 (score: 1)

Here is one way to make it work (using the sample data):
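A minimal sketch of such a fix, not necessarily the code originally posted with this answer. It rests on two observations about the question's code. First, on Windows each worker process re-imports the main module, so code that creates the pool needs to sit under an `if __name__ == "__main__":` guard; without it the workers fail during startup and every future reports `BrokenProcessPool`. Second, `denom()` takes no argument and modifies a global, so even a working pool would never receive the chunks or send anything back. The helper names `add_denominator`, `split_frame`, and `parallel_denominator` are hypothetical and chosen only for illustration.

```python
import concurrent.futures
from multiprocessing import cpu_count

import numpy as np
import pandas as pd

COLUMNS = ["basalareap", "basalareas", "basalaread"]


def add_denominator(chunk):
    """Return a copy of the chunk with the row-wise sum in 'denominator'."""
    chunk = chunk.copy()
    chunk["denominator"] = chunk[COLUMNS].sum(axis=1)
    return chunk


def split_frame(df, n_chunks):
    """Split df into row blocks (hypothetical helper, empty blocks dropped)."""
    index_blocks = np.array_split(np.arange(len(df)), n_chunks)
    return [df.iloc[block] for block in index_blocks if len(block) > 0]


def parallel_denominator(df, n_workers=None):
    """Compute 'denominator' chunk by chunk in worker processes."""
    n_workers = n_workers or cpu_count()
    chunks = split_frame(df, n_workers)
    with concurrent.futures.ProcessPoolExecutor(max_workers=n_workers) as executor:
        # executor.map yields results in input order, so concat keeps row order.
        results = list(executor.map(add_denominator, chunks))
    return pd.concat(results)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    np.random.seed(4)
    layer = pd.DataFrame(np.random.randint(0, 25, size=(10, 3)), columns=COLUMNS)
    layer = parallel_denominator(layer)
    print(layer)
```

The worker returns its chunk instead of mutating shared state, because each process works on its own pickled copy of the data. For a plain row sum like this one, the single vectorized call `layer[COLUMNS].sum(axis=1)` is likely to be faster than any process pool, since sending chunks to the workers and back costs more than the sum itself; a pool tends to pay off only when the per-chunk work is expensive.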
