Multiprocess plotting in matplotlib

Date: 2018-07-18 19:02:52

Tags: python pandas matplotlib multiprocessing python-multiprocessing

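How can I visualize data with matplotlib in a function-parallel way? That is, I want to create the figures in parallel processes and then display them in the main process.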

Here is an example:

# input data
import pandas as pd, matplotlib.pyplot as plt
df = pd.DataFrame(data={'i':['A','A','B','B'],
                       'x':[1.,2.,3.,4.],
                       'y':[1.,2.,3.,4.]})
df.set_index('i', inplace=True)
df.sort_index(inplace=True)

# function which creates a figure from the data
def Draw(df, i):
    fig = plt.figure(i)
    ax = fig.gca()
    df = df.loc[i,:]
    ax.scatter(df['x'], df['y'])
    return fig

def DrawWrapper(x): return Draw(*x)

# creating figures in parallel
from multiprocessing import Pool
poolSize = 2
with Pool(poolSize) as p:
    args = [(df,'A'), (df,'B')]
    figs = p.map(DrawWrapper, args)

# attempt to visualize the results
fig = plt.figure('A')
plt.show()
# FIXME: get "RuntimeError: main thread is not in main loop"

How can I transfer the figure objects from the worker processes so that I can display the figures in the main process?

Thanks for your help!

[Edit:] It was suggested that this thread solves the problem.

Here is the corresponding code:

# input data
import pandas as pd, matplotlib.pyplot as plt
df = pd.DataFrame(data={'i':['A','A','B','B'],
                       'x':[1.,2.,3.,4.],
                       'y':[1.,2.,3.,4.]})
df.set_index('i', inplace=True)
df.sort_index(inplace=True)

# function which creates a figure from the data
def Draw(df, i):
    fig = plt.figure(i)
    ax = fig.gca()
    df = df.loc[i,:]
    ax.scatter(df['x'], df['y'])
    plt.show()

# creating figures in parallel
from multiprocessing import Process
args = [(df,'A'), (df,'B')]
for a in args:
    p = Process(target=Draw, args=a)
    p.start()

# FIXME: result is the same (might be even worse since I do not 
# get any result which I could attempt to show):
# ...
# RuntimeError: main thread is not in main loop
# RuntimeError: main thread is not in main loop

Am I missing something?

1 answer:

Answer 0 (score: 1)

The answers to the linked question put the code that starts the processes inside an if __name__ == "__main__": clause. So the following should work here.

import pandas as pd
import matplotlib.pyplot as plt

import multiprocessing
#multiprocessing.freeze_support() # <- may be required on windows

df = pd.DataFrame(data={'i':['A','A','B','B'],
                       'x':[1.,2.,3.,4.],
                       'y':[1.,2.,3.,4.]})
df.set_index('i', inplace=True)
df.sort_index(inplace=True)

# function which creates a figure from the data
def Draw(df, i):
    fig, ax  = plt.subplots()
    df = df.loc[i,:]
    ax.scatter(df['x'], df['y'])
    plt.show()

# creating figures in parallel
args = [(df,'A'), (df,'B')]

def multiP():
    for a in args:
        p = multiprocessing.Process(target=Draw, args=a)
        p.start()

if __name__ == "__main__":         
    multiP()
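
The code above opens each figure in its own process. If the goal is instead to get the results back into the main process, as the question asks, one possible approach (not part of the original answer) is to have each worker render its figure to PNG bytes and return those, since bytes pickle cleanly. The sketch below is only an illustration under some assumptions: the names render_png, render_all and DATA are hypothetical, the data is held in plain lists rather than the question's DataFrame, and the figure is built with matplotlib.figure.Figure directly so that no GUI backend is touched in the workers.

```python
import io
from multiprocessing import Pool

from matplotlib.figure import Figure  # pyplot is avoided, so workers need no GUI backend

# Same points as the question's DataFrame, grouped by label.
# Plain lists are an assumption made to keep the sketch small.
DATA = {"A": ([1.0, 2.0], [1.0, 2.0]), "B": ([3.0, 4.0], [3.0, 4.0])}


def render_png(label):
    """Draw one scatter plot and return (label, PNG bytes)."""
    x, y = DATA[label]
    fig = Figure()
    ax = fig.subplots()
    ax.scatter(x, y)
    ax.set_title(label)
    buf = io.BytesIO()
    fig.savefig(buf, format="png")  # rendered off-screen into memory
    return label, buf.getvalue()


def render_all(labels, pool_size=2):
    """Render each label in a worker process and collect the bytes in the parent."""
    with Pool(pool_size) as pool:
        return dict(pool.map(render_png, labels))


if __name__ == "__main__":
    images = render_all(["A", "B"])
    for label, png in images.items():
        print(label, len(png), "bytes")
```

In the main process the returned bytes can be written to files, or decoded with plt.imread(io.BytesIO(png), format="png") and shown with plt.imshow. The trade-off is that the main process receives a rendered image, not an interactive figure object.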