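I want to write a class in Python that runs some algorithm for a given amount of time, then stops and returns the most up-to-date value it found before the timeout.
As an example, I wrote a simple class that finds the maximum value in a vector: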
import time, multiprocessing

class AnytimeAlgorithm:
    def __init__(self, vector):
        self.vector = vector
        self.result = 0

    def update_forever(self):
        while True:
            i = random.randint(0, len(self.vector) - 1)
            if self.vector[i] > self.result:
                self.result = self.vector[i]
                print("self", self, "result", self.result)

    def result_after(self, seconds):
        p = multiprocessing.Process(target=self.update_forever, name="update_forever", args=())
        p.start()
        p.join(seconds)
        if p.is_alive():
            p.terminate()
            p.join()
        print("self", self, "final result", self.result)
        return self.result

if __name__ == "__main__":
    import random, numpy as np
    vector = np.random.rand(10000000)
    maximizer = AnytimeAlgorithm(vector)
    print(maximizer.result_after(0.01))
When I run this class, it shows that, as expected, the result increases over time. However, the returned value is always 0! Here is a typical output:
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.420804014071
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.444555804935
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.852844624467
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.915336332491
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.964438367823
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.975029317702
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.975906346116
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.987784181209
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.996998726143
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.999480015562
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.999798469992
self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> final result 0
What is my mistake?
Answer 0 (score: 2)
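When you use multiprocessing in Python, it creates a new, independent Python process and runs whatever you ask in it. Do not let the fact that the API is simplified to look like threading confuse you. In the main process you create an AnytimeAlgorithm object. You then create a Process that runs one of its methods; this starts a new process and copies the interpreter's state, so a copy of the AnytimeAlgorithm object is available in the new process as well. The two objects are not the same, though: they do not even live in the same process, so they cannot (directly) share any information. Changes you make to the object in the new process affect only that process's copy, not the copy in the original process. That is why self.result is still 0 in the parent after the child has been terminated.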
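To make the copy semantics concrete, here is a minimal sketch. The Counter class and the run_in_child helper are hypothetical names introduced only for this illustration; they are not part of the question's code.

```python
import multiprocessing


class Counter:
    """Hypothetical helper class, used only to illustrate process isolation."""

    def __init__(self):
        self.value = 0

    def bump(self):
        # When used as a Process target, this mutates the child's copy only.
        self.value += 1
        print("child sees value =", self.value)


def run_in_child(counter):
    """Run counter.bump in a separate process and return the parent's value."""
    p = multiprocessing.Process(target=counter.bump)
    p.start()
    p.join()
    # The parent's object was never touched by the child.
    return counter.value


if __name__ == "__main__":
    counter = Counter()
    print("parent sees value =", run_in_child(counter))
```

The child prints 1 while the parent still prints 0, which is the same effect as the "final result 0" line in the question.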
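You can check the documentation on how to share information between the main process and the spawned process, for example through shared memory or through pipes and queues. Shared memory may be a good option here: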
import multiprocessing
import random
import numpy as np

class AnytimeAlgorithm:
    def __init__(self, vector):
        self.vector = vector
        # Shared double ('d') that both the parent and the child process can access.
        self.result = multiprocessing.Value('d', 0.0)

    def update_forever(self):
        # Runs in the child process until the parent terminates it.
        while True:
            i = random.randint(0, len(self.vector) - 1)
            if self.vector[i] > self.result.value:
                self.result.value = self.vector[i]
                print("self", self, "result", self.result.value)

    def result_after(self, seconds):
        p = multiprocessing.Process(target=self.update_forever, name="update_forever", args=())
        p.start()
        # Wait at most `seconds`, then stop the child if it is still running.
        p.join(seconds)
        if p.is_alive():
            p.terminate()
            p.join()
        print("self", self, "final result", self.result.value)
        return self.result.value

if __name__ == "__main__":
    vector = np.random.rand(10000000)
    maximizer = AnytimeAlgorithm(vector)
    print(maximizer.result_after(0.1))
Output:
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.01491873361800522
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.060776471658675835
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.7476611733129928
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9468162088782311
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9531978645650057
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9992671080742871
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.999293465561661
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9996894825552965
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9998511378366163
self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.999933119926922
self <__main__.AnytimeAlgorithm object at 0x00000195FBDC7908> final result 0.999933119926922
0.999933119926922
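For comparison, the queue-based option mentioned earlier could look roughly like the sketch below. It is only an illustration under some assumptions: search_max and result_after are hypothetical function names, the vector is a plain list rather than a NumPy array, and the child is stopped cooperatively through a multiprocessing.Event instead of terminate(), because terminating a process while it is using a queue can corrupt that queue.

```python
import multiprocessing
import random
import time


def search_max(vector, improvements, stop):
    """Child: report every improvement, then a None sentinel once asked to stop."""
    best = 0.0
    while not stop.is_set():
        candidate = random.choice(vector)
        if candidate > best:
            best = candidate
            improvements.put(best)
    improvements.put(None)  # sentinel: no more results will follow


def result_after(vector, seconds):
    """Parent: let the child search for `seconds`, then return its last report."""
    improvements = multiprocessing.Queue()
    stop = multiprocessing.Event()
    p = multiprocessing.Process(target=search_max, args=(vector, improvements, stop))
    p.start()
    time.sleep(seconds)
    stop.set()  # cooperative stop instead of terminate()
    best = 0.0
    # Drain the queue before join(); joining first can block while items are buffered.
    for value in iter(improvements.get, None):
        best = value
    p.join()
    return best


if __name__ == "__main__":
    vector = [random.random() for _ in range(100_000)]
    print(result_after(vector, 0.1))
```

This only works if a single iteration of the algorithm is short, since the child checks the stop flag once per loop. If the child has not started searching before the time budget runs out, the function returns the initial 0.0.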
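Note that using shared memory incurs extra overhead because access is synchronized between processes. Read the documentation to understand how locking works for this class, and consider writing your algorithm so that it touches shared resources as little as possible (for example, by working on a temporary local variable that is written out at the end of each computation).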
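A minimal sketch of that advice, assuming the same random-search example: the child compares against a local variable and touches the shared Value only when it actually finds an improvement. The names search_max and result_after are hypothetical, and the vector is a plain list so the block needs only the standard library.

```python
import multiprocessing
import random


def search_max(vector, shared_best):
    """Child: compare against a local copy, publish only real improvements."""
    best = 0.0  # local variable, no inter-process synchronization needed
    while True:
        candidate = random.choice(vector)
        if candidate > best:
            best = candidate
            # Value's default lock is recursive, so assigning .value inside
            # get_lock() is safe; the lock is held only for this one write.
            with shared_best.get_lock():
                shared_best.value = best


def result_after(vector, seconds):
    """Parent: stop the child after `seconds` and read the last published value."""
    shared_best = multiprocessing.Value('d', 0.0)
    p = multiprocessing.Process(target=search_max, args=(vector, shared_best))
    p.start()
    p.join(seconds)
    if p.is_alive():
        p.terminate()
        p.join()
    return shared_best.value


if __name__ == "__main__":
    vector = [random.random() for _ in range(100_000)]
    print(result_after(vector, 0.1))
```

Improvements become rare as the search progresses, so the lock is taken only a handful of times instead of on every iteration. One caveat: terminate() can leave a lock held if it hits the child in the middle of a write. Keeping writes rare makes that unlikely, and a cooperative stop flag would avoid it entirely.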