Computing a mean in a Monte Carlo simulation using Python multiprocessing

Time: 2018-10-16 02:51:28

Tags: python multiprocessing

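I have been reading about multiprocessing in Python (a number of posts, as well as various articles, websites, and videos), but I am still confused about how to apply it to my particular problem. I wrote a simple example that uses a Monte Carlo simulation to compute the mean of randomly generated integers. I store the random integers in a variable named integers so that I can compute the mean at the end; I also generate random ndarrays and store them in a variable named arrays, because I need to do some post-processing on those arrays later: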

import numpy as np

nMCS = 10 ** 8

integers = []
arrays = []
for i in range(nMCS):
    a = np.random.randint(0,10)
    b = np.random.rand(10,2)

    integers.append(a)
    arrays.append(b)

mean_val = np.average(integers)
# I will do post-processing on 'arrays' later!!

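Now I want to use all 16 cores of my computer so that the random numbers/arrays are not generated sequentially and the run finishes faster. From what I have learned, I understand that I need to store the result of each Monte Carlo simulation (i.e. the generated random integer and the random numpy.ndarray) and then use interprocess communication to collect all the results in lists afterwards. I have written several versions of the code, but unfortunately none of them work. For example, when I write something like this: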

import numpy as np
import multiprocessing

nMCS = 10 ** 6

integers = []
arrays = []

def monte_carlo():
    a = np.random.randint(0,10)
    b = np.random.rand(10,2)

if __name__ == '__main__':
    __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)" # this is because I am using Spyder!

    p1 = multiprocessing.Process(target = monte_carlo)

    p1.start()

    p1.join()

    for i in range(nMCS):

        integers.append(a)
        arrays.append(b)

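I get the error "name 'a' is not defined". Could someone help me with this and tell me how to generate as many random integers/arrays as possible concurrently and then add them all to lists for further processing?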

2 answers:

Answer 0: (score: 2)

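Returning a large volume of results means a lot of data has to be passed between processes, so I suggest splitting the task into chunks and reducing each chunk inside the worker before returning it.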

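import numpy as np
from multiprocessing import Pool

n = 4  # number of chunks, one task per chunk

def processResult(raw_result):
    # Placeholder for your own reduction step; assumed here to return
    # [avg(a), reformed_array(b)] for one chunk
    avg_a = np.mean([a for a, b in raw_result])
    reformed_b = np.mean([b for a, b in raw_result], axis=0)
    return [avg_a, reformed_b]

def monte_carlo(i):
    # i is the chunk index supplied by the pool; it is not used in the computation
    np.random.seed()  # reseed from OS entropy so forked workers do not repeat the same draws
    raw_result = []
    for j in range(10**4 // n):  # // keeps the count an integer, as range() requires
        a = np.random.randint(0,10)
        b = np.random.rand(10,2)
        raw_result.append([a,b])
    result = processResult(raw_result)
    #Your method to reduce the result before returning;
    #let's assume the return value is [avg(a),reformed_array(b)]
    return result

if __name__ == '__main__':
    __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)" # this is because I am using Spyder!

    pool = Pool(processes=4)
    #you can control how many processes here, for example multiprocessing.cpu_count()-1 to avoid completely blocking the machine

    multiple_results = [pool.apply_async(monte_carlo, (i,)) for i in range(n)]

    data = [res.get() for res in multiple_results]
    #OR
    data = pool.map(monte_carlo, range(n))
    #Both give you a list of [avg(a),reformed_array(b)], one entry per chunk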
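
The per-chunk results still have to be combined into one overall mean. The following is a minimal sketch of that step under some assumptions, not part of the original answer: the names `simulate_chunk`, `combine_means`, and `run` are hypothetical, averaging the arrays stands in for whatever post-processing is actually needed, and each chunk gets its own seed from `numpy.random.SeedSequence.spawn` so that the workers draw independent random streams.

```python
import numpy as np
from multiprocessing import Pool


def simulate_chunk(args):
    """Run one chunk of the simulation and reduce it before returning.

    args is a (seed_sequence, n_samples) tuple so it can be sent via Pool.map.
    """
    seed_seq, n_samples = args
    rng = np.random.default_rng(seed_seq)           # independent stream per chunk
    integers = rng.integers(0, 10, size=n_samples)  # same range as randint(0, 10)
    arrays = rng.random((n_samples, 10, 2))         # n_samples arrays of shape (10, 2)
    # Assumed post-processing: average the arrays. Returning small summaries
    # instead of raw samples keeps interprocess communication cheap.
    return integers.mean(), arrays.mean(axis=0), n_samples


def combine_means(chunk_results):
    """Weighted average of the per-chunk integer means."""
    means = [r[0] for r in chunk_results]
    counts = [r[2] for r in chunk_results]
    return np.average(means, weights=counts)


def run(n_total=10**5, n_chunks=4, seed=12345):
    """Split n_total samples over n_chunks tasks and return the overall mean."""
    sizes = [n_total // n_chunks] * n_chunks
    sizes[-1] += n_total - sum(sizes)               # last chunk takes the remainder
    seeds = np.random.SeedSequence(seed).spawn(n_chunks)
    with Pool(processes=n_chunks) as pool:
        results = pool.map(simulate_chunk, list(zip(seeds, sizes)))
    return combine_means(results)


if __name__ == '__main__':
    print(run())
```

Weighting by the sample count matters only when the chunks differ in size; with equal chunks it reduces to a plain average of the chunk means.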

Answer 1: (score: 0)

A simple mistake.

a and b are created inside your function; they do not exist in your main scope. You need to return them from the function.

def monte_carlo():
    a = np.random.randint(0,10)
    b = np.random.rand(10,2)
    #create a return statement here. It may help if you put them into an array so you can return 2 value

if __name__ == '__main__':
    __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)" # this is because I am using Spyder!

    p1 = multiprocessing.Process(target = monte_carlo)

    p1.start()

    p1.join()
    #Call your function here and save the return to something
    for i in range(nMCS):

      integers.append(a) # paste here
      arrays.append(b) # and here

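Edit: I tested the code and found that you never called the monte_carlo function. a and b now work, but there is a new error for you to solve. Sorry, I cannot help with it because I do not understand that error myself, but here is my edit of your code.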

import numpy as np
import multiprocessing

nMCS = 10 ** 6

integers = []
arrays = []

def monte_carlo():
    a = np.random.randint(0,10)
    b = np.random.rand(10,2)
    temp = [a,b]
    return temp

if __name__ == '__main__':
    __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)" # this is because I am using Spyder!

    p1 = multiprocessing.Process(target = monte_carlo())#added the extra brackets here

    p1.start()

    p1.join()

    for i in range(nMCS):
        array = monte_carlo()
        integers.append(array[0])
        arrays.append(array[1])

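This is the error I got after making this edit. I am still learning multiprocessing myself, so someone else may be better placed to help with it.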

Process Process-6:
Traceback (most recent call last):
  File "c:\users\lunar\appdata\local\continuum\anaconda3\lib\multiprocessing\process.py", line 252, in _bootstrap
    self.run()
  File "c:\users\lunar\appdata\local\continuum\anaconda3\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
TypeError: 'list' object is not callable
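
The `TypeError` in this traceback appears to come from `target = monte_carlo()`: the parentheses call the function in the parent process and hand its return value (a list) to `Process`, which then tries to call that list. Passing the function object itself (`target=monte_carlo`) avoids the error, but a `Process` discards its target's return value, so the result still needs an explicit channel back to the parent. The sketch below is one possible fix, not the only one; the `worker` wrapper is a hypothetical helper that sends the result through a `multiprocessing.Queue`.

```python
import multiprocessing

import numpy as np


def monte_carlo():
    """One Monte Carlo draw: a random integer in [0, 10) and a (10, 2) array."""
    a = np.random.randint(0, 10)
    b = np.random.rand(10, 2)
    return a, b


def worker(queue):
    """Hypothetical wrapper: run one draw in the child and send it to the parent."""
    np.random.seed()  # reseed from OS entropy so forked children do not share a state
    queue.put(monte_carlo())


if __name__ == '__main__':
    queue = multiprocessing.Queue()
    # Pass the function itself (no parentheses) and its arguments separately.
    p1 = multiprocessing.Process(target=worker, args=(queue,))
    p1.start()
    a, b = queue.get()  # read before join() so the child can flush the queue
    p1.join()
    print(a, b.shape)
```

Starting one process per sample would be far slower than the sequential loop for 10 ** 6 draws, so in practice each worker should generate a whole batch, as in the chunked approach of Answer 0.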