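Anytime algorithm using Python multiprocessing

Time: 2018-02-21 09:43:02

Tags: python multiprocessing

I want to write a class in Python that runs some algorithm for a given period of time, then stops and returns the most up-to-date value it found before the timeout.

As an example, I wrote a simple class that finds the maximum value in a vector: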

import multiprocessing
import random

class AnytimeAlgorithm:
    def __init__(self, vector):
        self.vector = vector
        self.result = 0

    def update_forever(self):
        while True:
            i = random.randint(0, len(self.vector) - 1)
            if self.vector[i] > self.result:
                self.result = self.vector[i]
                print("self", self, "result", self.result)

    def result_after(self, seconds):
        p = multiprocessing.Process(target=self.update_forever, name="update_forever", args=())
        p.start()
        p.join(seconds)
        if p.is_alive():
            p.terminate()
        p.join()
        print("self", self, "final result", self.result)
        return self.result


if __name__ == "__main__":
    import random, numpy as np
    vector = np.random.rand(10000000)
    maximizer = AnytimeAlgorithm(vector)
    print(maximizer.result_after(0.01))

When I run this class, it shows that, as expected, the result increases over time. However, the return value is always 0! This is a typical output:

self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.420804014071
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.444555804935
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.852844624467
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.915336332491
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.964438367823
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.975029317702
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.975906346116
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.987784181209
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.996998726143
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.999480015562
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.999798469992
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> final result 0

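What is my mistake?

1 answer:

Answer 0 (score: 2):

When you use multiprocessing in Python, it creates a new, independent Python process and runs whatever you asked for in it. Don't let the fact that the API is simplified to look like multithreading confuse you. In the main process you create an AnytimeAlgorithm object. Then you create a Process that runs a function; this starts a new process and copies the interpreter state, so the new process has its own copy of the AnytimeAlgorithm object. These two objects are not the same, though. They don't even live in the same process, so they cannot (directly) share any information. Changes you make to the object in the new process only affect that process's copy, not the copy in the original process.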
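The copy semantics described above can be seen in isolation with a minimal sketch. The Counter class, the bump method and the value 42 are made up for illustration and are not part of the question's code:

```python
import multiprocessing


class Counter:
    """Hypothetical stand-in for any object with mutable state."""

    def __init__(self):
        self.value = 0

    def bump(self):
        # Runs in the child process, so only the child's copy is modified.
        self.value = 42


def value_after_child_update():
    """Mutate a Counter in a child process and return what the parent sees."""
    counter = Counter()
    p = multiprocessing.Process(target=counter.bump)
    p.start()
    p.join()
    return counter.value  # the parent's copy was never touched


if __name__ == "__main__":
    print(value_after_child_update())
```

Run as a script, this prints 0: the assignment happened in the child's copy of the object, which disappears when the child exits.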

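You can check the documentation on how to share information between the main process and the spawned processes, for example using pipes, queues or shared memory, which may be a good choice here: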

import multiprocessing
import random
import numpy as np

class AnytimeAlgorithm:
    def __init__(self, vector):
        self.vector = vector
        self.result = multiprocessing.Value('d', 0.0)

    def update_forever(self):
        while True:
            i = random.randint(0, len(self.vector) - 1)
            if self.vector[i] > self.result.value:
                self.result.value = self.vector[i]
                print("self", self, "result", self.result.value)

    def result_after(self, seconds):
        p = multiprocessing.Process(target=self.update_forever, name="update_forever", args=())
        p.start()
        p.join(seconds)
        if p.is_alive():
            p.terminate()
        p.join()
        print("self", self, "final result", self.result.value)
        return self.result.value


if __name__ == "__main__":
    # random and numpy are already imported at the top of the module.
    vector = np.random.rand(10000000)
    maximizer = AnytimeAlgorithm(vector)
    print(maximizer.result_after(0.1))

Output:

self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.01491873361800522
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.060776471658675835
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.7476611733129928
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9468162088782311
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9531978645650057
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9992671080742871
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.999293465561661
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9996894825552965
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9998511378366163
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.999933119926922
self <__main__.AnytimeAlgorithm object at 0x00000195FBDC7908> final result 0.999933119926922
0.999933119926922
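For comparison, the pipe route mentioned earlier might look roughly like the sketch below. It is only one possible arrangement, under several assumptions: the function names search_max and max_after are hypothetical, a plain list stands in for the NumPy vector to keep the example self-contained, and the child is stopped cooperatively through a multiprocessing.Event rather than with terminate().

```python
import multiprocessing
import random
import time


def search_max(vector, conn, stop):
    """Child: sample random entries and send each improvement to the parent."""
    best = float("-inf")
    while True:
        candidate = vector[random.randrange(len(vector))]
        if candidate > best:
            best = candidate
            conn.send(best)  # only improvements cross the process boundary
        if stop.is_set():
            break
    conn.close()


def max_after(vector, seconds):
    """Parent: collect improvements for `seconds`, then ask the child to stop."""
    recv_end, send_end = multiprocessing.Pipe(duplex=False)
    stop = multiprocessing.Event()
    child = multiprocessing.Process(target=search_max, args=(vector, send_end, stop))
    child.start()
    send_end.close()  # the parent only reads

    best = None
    deadline = time.monotonic() + seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        if recv_end.poll(remaining):
            try:
                best = recv_end.recv()
            except EOFError:  # the child exited early
                break

    stop.set()  # cooperative shutdown instead of terminate()
    child.join()
    # Pick up anything sent between the last read and the stop signal.
    while recv_end.poll():
        try:
            best = recv_end.recv()
        except EOFError:
            break
    return best


if __name__ == "__main__":
    data = [random.random() for _ in range(100_000)]
    print(max_after(data, 0.1))
```

Because the child only sends values that improve on its own best, the last value received is the largest one found. Checking the event on every iteration has a cost of its own, so a real implementation might check it less often.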

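Note that using shared memory (here, multiprocessing.Value) incurs additional overhead because access is synchronized between processes. Read the documentation to see how locking works for this class, and consider writing your algorithm so that it minimizes access to the shared resource (for example, by using a temporary local variable that is written to the shared value at the end of each computation).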
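A minimal sketch of that advice, assuming the same random-sampling search as above: the child keeps its best value in a local variable and touches the shared Value only when it finds an improvement. The standalone functions update_forever and result_after are illustrative rewrites, not the class from the answer, and a plain list replaces the NumPy vector.

```python
import multiprocessing
import random


def update_forever(vector, shared_best):
    """Child: compare against a local variable; write to shared memory rarely."""
    best = shared_best.value  # one synchronized read at start-up
    while True:
        candidate = vector[random.randrange(len(vector))]
        if candidate > best:  # plain local comparison, no lock involved
            best = candidate
            shared_best.value = best  # synchronized write, only on improvement


def result_after(vector, seconds):
    """Parent: let the child search for `seconds`, then read the shared value."""
    shared_best = multiprocessing.Value("d", float("-inf"))
    p = multiprocessing.Process(target=update_forever, args=(vector, shared_best))
    p.start()
    p.join(seconds)
    if p.is_alive():
        p.terminate()
    p.join()
    return shared_best.value


if __name__ == "__main__":
    data = [random.random() for _ in range(100_000)]
    print(result_after(data, 0.1))
```

With the default synchronized Value, each access to .value acquires its lock, so comparing against the local variable avoids taking that lock on every iteration. The multiprocessing documentation also warns that terminating a process while it holds a lock can cause other processes to deadlock; writing rarely makes that window small but does not remove it.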