Problem sharing data between Python processes using multiprocessing

Time: 2016-11-17 15:16:39

Tags: python subprocess python-multiprocessing

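I've seen several posts about this, so I know it is fairly straightforward to do, but I seem to be coming up short. I'm not sure whether I need to create a worker pool or use the Queue class. Basically, I want to be able to create several processes that each act autonomously (which is why they inherit from an Agent superclass).

At random ticks of my main loop I want to update each agent. I use time.sleep with different values in the main loop and in each agent's run loop to simulate different processor speeds.

Here is my Agent superclass: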

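# Imports implied by the snippets in this question
import multiprocessing as mpc
import random
import time

# Generic class to handle mpc of each agent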
class Agent(mpc.Process):
  # initialize agent parameters
  def __init__(self,):
    # init mpc
    mpc.Process.__init__(self)
    self.exit = mpc.Event()

  # an agent's main loop...generally should be overridden
  def run(self):
    while not self.exit.is_set():
      pass
    print "You exited!"

  # safely shutdown an agent
  def shutdown(self):
    print "Shutdown initiated"
    self.exit.set()

  # safely communicate values to this agent
  def communicate(self,value):
    print value

A subclass for a specific agent (simulating an HVAC system):

class HVAC(Agent):
  def __init__(self, dt=70, dh=50.0):
    super(HVAC, self).__init__()  # super(Agent, self) would skip Agent.__init__
    self.exit = mpc.Event()

    self.__pref_heating     = True
    self.__pref_cooling     = True
    self.__desired_temperature = dt
    self.__desired_humidity    = dh

    self.__meas_temperature = 0
    self.__meas_humidity    = 0.0
    self.__hvac_status      = "" # heating, cooling, off

    self.start()

  def run(self): # handle AC or heater on 
    while not self.exit.is_set():
      ctemp = self.measureTemp()
      chum  = self.measureHumidity()

      if (ctemp < self.__desired_temperature):
        self.__hvac_status = 'heating'
        self.__meas_temperature += 1

      elif (ctemp > self.__desired_temperature):
        self.__hvac_status = 'cooling'
        self.__meas_temperature -= 1  # cooling should lower the temperature

      else:
        self.__hvac_status = 'off'
      print self.__hvac_status, self.__meas_temperature


      time.sleep(0.5)


    print "HVAC EXITED"

  def measureTemp(self):
    return self.__meas_temperature
  def measureHumidity(self):
    return self.__meas_humidity

  def communicate(self,updates):
    self.__meas_temperature = updates['temp']
    self.__meas_humidity    = updates['humidity']
    print "Measured [%d] [%f]" % (self.__meas_temperature,self.__meas_humidity)

My main loop:

if __name__ == "__main__":
  print "Initializing subsystems"
  agents = {}
  agents['HVAC'] = HVAC()

  # Run simulation
  timestep = 0
  while timestep < args.timesteps:
    print "Timestep %d" % timestep

    if timestep % 10 == 0:
      curr_temp = random.randrange(68,72)
      curr_humidity = random.uniform(40.0,60.0)
      agents['HVAC'].communicate({'temp':curr_temp, 'humidity':curr_humidity})

    time.sleep(1)
    timestep += 1

  agents['HVAC'].shutdown()
  print "HVAC process state: %d" % agents['HVAC'].is_alive()

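So the issue is that whenever I run agents['HVAC'].communicate(x) in the main loop, I can see the value being passed into the HVAC subclass (it prints, so the value is received correctly). However, the value is never actually stored for the run loop to use.

So typical output looks like this: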

Initializing subsystems
Timestep 0
Measured [68] [56.948675]
heating 1
heating 2
Timestep 1
heating 3
heating 4
Timestep 2
heating 5
heating 6

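In reality, as soon as Measured [68] appears, the internally stored value should be updated so that the output starts from 68 (not heating 1, heating 2, etc.). So effectively, the HVAC's self.__meas_temperature is not being updated properly.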

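Edit: After a bit of research, I realized that I didn't really understand what is happening behind the scenes. Each subprocess operates on its own chunk of virtual memory and is completely isolated from data shared this way, so passing the value in by calling a method is not going to work. My new issue is that I'm not sure how to share a global value with multiple processes.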
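To illustrate the point in the edit above, here is a minimal sketch in Python 3. The Counter class and its attribute names are hypothetical and not part of the code in this question. It sets a plain attribute and a multiprocessing.Value from the parent after the child has started, then lets the child report what it sees.

```python
import multiprocessing as mp


class Counter(mp.Process):
    """Hypothetical agent with one plain attribute and one shared Value."""

    def __init__(self):
        super().__init__()
        self.plain = 0                   # ordinary attribute: the child gets a copy
        self.shared = mp.Value("i", 0)   # C int stored in shared memory
        self.updated = mp.Event()        # signals that the parent has written
        self.seen = mp.Queue()           # child reports what it observed

    def run(self):
        # Runs in the child process.
        self.updated.wait(timeout=5)
        self.seen.put((self.plain, self.shared.value))

    def communicate(self, value):
        # Runs in the parent process, like HVAC.communicate in the question.
        self.plain = value               # changes only the parent's copy
        self.shared.value = value        # visible to the child as well
        self.updated.set()


def demo():
    agent = Counter()
    agent.start()
    agent.communicate(68)
    plain_in_child, shared_in_child = agent.seen.get(timeout=5)
    agent.join()
    return plain_in_child, shared_in_child


if __name__ == "__main__":
    print(demo())
```

Here demo() returns (0, 68): the child never sees the plain attribute change, which matches the "heating 1, heating 2" output above, while the Value written by the parent is visible in the child.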

I was looking at the Queue and JoinableQueue classes, but I'm not sure how to pass a Queue into the kind of superclass setup I have (especially given the mpc.Process.__init__(self) call).
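One possible way to do this, sketched in Python 3 under the assumption that each agent owns an inbox queue: the queue is simply passed to __init__ and stored as an attribute after the Process has been initialised. QueueAgent, inbox and outbox are hypothetical names introduced for illustration only.

```python
import multiprocessing as mp
import queue


class QueueAgent(mp.Process):
    """Hypothetical agent that receives updates through an inbox queue."""

    def __init__(self, inbox, outbox):
        # Same pattern as mpc.Process.__init__(self): initialise the
        # Process first, then keep the queues as ordinary attributes.
        mp.Process.__init__(self)
        self.inbox = inbox
        self.outbox = outbox
        self.exit = mp.Event()

    def run(self):
        # Runs in the child process.
        while not self.exit.is_set():
            try:
                update = self.inbox.get(timeout=0.1)
            except queue.Empty:
                continue                  # nothing new, check the exit flag again
            # Echo the stored temperature back so the parent can verify it.
            self.outbox.put(update["temp"])

    def shutdown(self):
        self.exit.set()


def demo():
    inbox, outbox = mp.Queue(), mp.Queue()
    agent = QueueAgent(inbox, outbox)
    agent.start()
    inbox.put({"temp": 68, "humidity": 56.9})   # replaces communicate()
    stored = outbox.get(timeout=5)
    agent.shutdown()
    agent.join()
    return stored


if __name__ == "__main__":
    print(demo())
```

The parent no longer calls a method on the agent to hand over data; it puts a message on the queue and the child's run loop picks it up inside its own process.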

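A side concern: can multiple agents read a value from the queue without removing it from the queue? For example, if I wanted to share a temperature value with multiple agents, would a Queue work for this?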
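For context on this side concern: each item put on a multiprocessing.Queue is delivered to exactly one get() call, so a queue on its own does not broadcast. A minimal Python 3 sketch of one alternative, assuming a single numeric reading is enough, is a shared multiprocessing.Value that every reader can inspect without consuming it. The function and variable names below are hypothetical.

```python
import multiprocessing as mp


def read_temperature(temperature, results, reader_id):
    # Reading .value does not consume it, unlike Queue.get().
    results.put((reader_id, temperature.value))


def demo(n_readers=3):
    temperature = mp.Value("d", 0.0)    # "d" is a C double in shared memory
    results = mp.Queue()
    temperature.value = 70.5            # written once by the main process

    readers = [
        mp.Process(target=read_temperature, args=(temperature, results, i))
        for i in range(n_readers)
    ]
    for p in readers:
        p.start()
    # Collect one report per reader; sort because arrival order is not fixed.
    seen = sorted(results.get(timeout=5) for _ in readers)
    for p in readers:
        p.join()
    return seen


if __name__ == "__main__":
    print(demo())
```

All three readers report the same 70.5. For richer data than a single number, a Manager-backed dict or one queue per agent are other options.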

Pipe v Queue

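1 answer:

Answer 0: (score: 1)

Here is a suggested solution, assuming that you want the following:

  • A centralized manager / main process that controls the lifecycle of the workers
  • Worker processes that each do something self-contained and then report their results to the manager and to other processes

Before I show it, for the record I want to say that in general, unless you are CPU-bound, multiprocessing is not really the right fit, mainly because of the added complexity, and you would probably be better off using a different high-level asynchronous framework. Also, you should use python 3, it's so much better!

That said, multiprocessing.Manager makes this pretty easy to do with multiprocessing. I did this in python 3; I don't see anything that shouldn't "just work" in python 2, but I haven't checked.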

from ctypes import c_bool
from multiprocessing import Manager, Process, Value
from pprint import pprint
from time import sleep, time


class Agent(Process):

    def __init__(self, name, shared_dictionary, delay=0.5):
        """My take on your Agent.

        Key difference is that I've commonized the run-loop and used
        a shared value to signal when to stop, to demonstrate it.
        """
        super(Agent, self).__init__()
        self.name = name

        # This is going to be how we communicate between processes.
        self.shared_dictionary = shared_dictionary

        # Create a silo for us to use.
        shared_dictionary[name] = []
        self.should_stop = Value(c_bool, False)

        # Primarily for testing purposes, and for simulating 
        # slower agents.
        self.delay = delay

    def get_next_results(self):
        # In the real world I'd use abc.ABCMeta as the metaclass to do 
        # this properly.
        raise RuntimeError('Subclasses must implement this')

    def run(self):
        ii = 0
        while not self.should_stop.value:
            ii += 1
            # debugging / monitoring
            print('%s %s run loop execution %d' % (
                type(self).__name__, self.name, ii))

            next_results = self.get_next_results()

            # Add the results, along with a timestamp.
            self.shared_dictionary[self.name] += [(time(), next_results)]
            sleep(self.delay)

    def stop(self):
        self.should_stop.value = True
        print('%s %s stopped' % (type(self).__name__, self.name))


class HVACAgent(Agent):
    def get_next_results(self):
        # This is where you do your work, but for the sake of
        # the example just return a constant dictionary.
        return {'temperature': 5, 'pressure': 7, 'humidity': 9}


class DumbReadingAgent(Agent):
    """A dumb agent to demonstrate workers reading other worker values."""

    def get_next_results(self):
        # get hvac 1 results:
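        hvac1_results = self.shared_dictionary.get('hvac 1')
        # The silo starts out as an empty list, so guard against both a
        # missing key and "no results yet" before indexing.
        if not hvac1_results:
            return None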

        return hvac1_results[-1][1]['temperature']

# Script starts. The guard is needed on platforms that start children
# with "spawn" (e.g. Windows), where each child re-imports this module.
if __name__ == '__main__':
    results = {}

    # The "with" ensures we terminate the manager at the end.
    with Manager() as manager:

        # the manager is a subprocess in its own right. We can ask
        # it to manage a dictionary (or other python types) for us
        # to be shared among the other children.
        shared_info = manager.dict()

        hvac_agent1 = HVACAgent('hvac 1', shared_info)
        hvac_agent2 = HVACAgent('hvac 2', shared_info, delay=0.1)
        dumb_agent = DumbReadingAgent('dumb hvac1 reader', shared_info)

        agents = (hvac_agent1, hvac_agent2, dumb_agent)

        for agent in agents:
            agent.start()

        sleep(1)

        for agent in agents:
            agent.stop()
        for agent in agents:
            agent.join()

        # The dict proxy talks to the manager process, so copy the
        # data out while the manager is still running.
        results = dict(shared_info)

    pprint(results)
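A note on the `+=` used in Agent.run above: a value read from a manager dict is a local copy, so mutating it in place does not reach the manager, while reassigning the key does. The following Python 3 sketch shows the difference; demo() is a hypothetical helper, and only Manager and its dict proxy come from the code above.

```python
from multiprocessing import Manager


def demo():
    with Manager() as manager:
        shared = manager.dict()
        shared["hvac 1"] = []

        # shared["hvac 1"] returns a copy of the list; append() changes
        # that temporary copy only, so the manager never sees it.
        shared["hvac 1"].append("lost")
        after_append = list(shared["hvac 1"])

        # += reads the list, extends the copy, then assigns it back to
        # the key, which sends the new list to the manager.
        shared["hvac 1"] += ["kept"]
        after_reassign = list(shared["hvac 1"])

    return after_append, after_reassign


if __name__ == "__main__":
    print(demo())
```

demo() returns ([], ['kept']). Note that the read-modify-write is not atomic, so two processes updating the same key could overwrite each other; in the answer's code each agent writes only to its own key, which avoids that.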