Python - Strange behavior of queues, managers and multiprocessing

Asked: 2015-10-28 14:34:27

Tags: python queue python-multiprocessing

While adding concurrent downloads to my pet project with the multiprocessing module, I came across some strange behavior involving Queue objects produced by multiprocessing.Manager() objects.

Depending on how I put a Queue object (produced by a Manager) into another Queue object (also produced by a Manager), I get different behavior for what is, to my understanding, the same thing. Here is a minimal working example:

import multiprocessing
import Queue

def work(inbound_queue, keep_going):
    while keep_going.value == 1:
        try:
            outbound_queue = inbound_queue.get(False) # this fails in case 3
            #do some work
            outbound_queue.put("work done!")
        except Queue.Empty:
            pass #this is busy wait of course, it's just an example

class Weird:
    def __init__(self):
        self.manager = multiprocessing.Manager()
        self.queue = self.manager.Queue()
        self.keep_going = multiprocessing.Value("i", 1)
        self.worker = multiprocessing.Process(target = work, args = (self.queue, self.keep_going))
        self.worker.start()
    def stop(self): #close and join the second process
        self.keep_going.value = 0
        self.worker.join()
    def queueFromOutside(self, q):
        self.queue.put(q)
        return q
    def queueFromNewManager(self):
        q = multiprocessing.Manager().Queue()
        self.queue.put(q)
        return q
    def queueFromOwnManager(self):
        q = self.manager.Queue()
        self.queue.put(q)
        return q

if __name__ == '__main__':
    instance = Weird()
    # CASE 1
    queue = multiprocessing.Manager().Queue()
    q1 = instance.queueFromOutside(queue) # Works fine
    print "1: ", q1.get()

    # CASE 2
    q2 = instance.queueFromNewManager()   # Works fine
    print "2: ", q2.get()

    # CASE 3
    q3 = instance.queueFromOwnManager()   # Error
    print "3: ", q3.get()

    instance.stop() #sadly never called :(

And its output (Python 2.7.10 x86, Windows).

Main process output:

1:  work done!
2:  work done!
3:

Then the worker process crashes, leaving q3.get() hanging.

Worker process output:

Process Process-2:
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "J:\Dropbox\Python\queues2.py", line 7, in work
    outbound_queue = inbound_queue.get(False) # this fails in case 3
  File "<string>", line 2, in get
  File "C:\Python27\lib\multiprocessing\managers.py", line 774, in _callmethod
    raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Unserializable message: ('#RETURN', <Queue.Queue instance at 0x025A22B0>)
---------------------------------------------------------------------------

So the question is: why does the third case cause a RemoteError?

The example provided doesn't resemble the code structure of the actual project, but I do send queues to a running process, and it works fine as long as I use methods #1 and #2. Using method #3 would be nice, though, since it would save the trouble of obtaining a Manager every time, which can take quite a while (about 100 ms on the machine I'm working from right now).
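A middle ground, sketched below with made-up names (work, demo), is to create one extra long-lived Manager next to the first and hand out reply queues from it. The reply proxies then come from a different manager than the inbound queue, so they behave like cases #1 and #2, while the Manager startup cost is paid only once:

```python
import multiprocessing

def work(inbound_queue):
    # receive a reply-queue proxy and answer on it once
    outbound_queue = inbound_queue.get()
    outbound_queue.put("work done!")

def demo():
    inbound_manager = multiprocessing.Manager()
    reply_manager = multiprocessing.Manager()  # created once, reused for every reply queue
    inbound = inbound_manager.Queue()
    reply = reply_manager.Queue()  # proxy from a different manager than inbound
    inbound.put(reply)             # same situation as cases #1 and #2: this round-trips fine
    worker = multiprocessing.Process(target = work, args = (inbound,))
    worker.start()
    worker.join()
    return reply.get()

if __name__ == '__main__':
    print(demo())
```

This keeps the convenience of method #3 (no per-call Manager creation) while avoiding putting a proxy into a queue owned by the same manager.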

This question is mostly out of curiosity, since I'm still discovering all the cool stuff in the multiprocessing module.

Update, trying to clarify the question: in case 3 (queueFromOwnManager), why can a queue created with self.manager.Queue(), once put into self.queue, not be retrieved with self.queue.get(), while a queue created with multiprocessing.Manager().Queue() can be? The order in which the 3 cases are executed doesn't matter. Ideally, instance.queue would be empty before and after each of the 3 method calls in the example.

Update 2: made the example more similar to what I actually do in my code.

1 Answer

Answer 0 (score: 0):

Updated answer: I added code in main that populates self.queue with items from the list ls and prints them. I also added a statement that can be used, outside the functions, to retrieve the items that were placed in self.queue from ls.

import multiprocessing
import Queue

def work(inbound_queue, keep_going):
    while keep_going.value == 1:
        try:
            pass
            #outbound_queue = inbound_queue.get(False) # commented out: it fails in case 3,
            #do some work                              # and nothing puts reply queues here anymore
            #outbound_queue.put("work done!")
        except Queue.Empty:
            pass #this is busy wait of course, it's just an example

class Weird:
    def __init__(self):
        self.manager = multiprocessing.Manager()
        self.queue = self.manager.Queue()
        self.queue2 = multiprocessing.Manager().Queue()
        self.keep_going = multiprocessing.Value("i", 1)
        self.worker = multiprocessing.Process(target = work, args = (self.queue, self.keep_going))
        self.worker.start()
    def stop(self): #stop the second process; the join is left out here
        self.keep_going.value = 0
        #self.worker.join()
    def queueFromOutside(self, q):
        # populate self.queue with elements from list ls
        ls = [1, 2, 3, 4, 5]
        for i in ls:
            self.queue.put(i)
        return self.queue
    def queueFromNewManager(self):
        #q = multiprocessing.Manager().Queue()  <-- no longer used: at this point
        # self.queue already holds the elements of ls from the previous call
        ls = [5, 6, 7, 8]
        # populate self.queue with elements from list ls
        for i in ls:
            self.queue.put(i)
        return self.queue
    def queueFromOwnManager(self):
        q = self.manager.Queue()
        ls = [5, 6, 7, 8]
        # populate self.queue (not q) with elements from list ls
        for i in ls:
            self.queue.put(i)
        return self.queue
    def wait_completion(self):
        """Wait for completion of all the tasks in the queue."""
        # assumes a joinable self.tasks queue; check the docs for how
        # to add tasks and data to a manager
        self.tasks.join()

if __name__ == '__main__':
    instance = Weird()
    # CASE 1

    q1 = instance.queueFromOutside(instance.queue2) # Works fine
    print "1: ", q1.get()

    #this code gets data from instance.queue in external functions
    if not instance.queue.empty():

        item = instance.queue.get(True)
        print item,"item"

    # CASE 2
    q2 = instance.queueFromNewManager()   # Works fine
    print "2: ", q2.get()

    # CASE 3
    q3 = instance.queueFromOwnManager()   # Error
    print "3: ", q3.get()

    instance.stop() #sadly never called :(

    # when all tasks are done, instance.wait_completion() could be called here to join them
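The wait_completion idea above relies on a joinable task queue that the snippet never sets up. A minimal self-contained sketch of that pattern (the names consume and run_tasks are made up for illustration) uses multiprocessing.JoinableQueue with a None sentinel instead of the keep_going flag:

```python
import multiprocessing

def consume(tasks):
    # drain the joinable queue, acknowledging every item
    while True:
        item = tasks.get()
        if item is None:        # sentinel: no more work
            tasks.task_done()
            break
        # ... do some work with item here ...
        tasks.task_done()

def run_tasks(items):
    tasks = multiprocessing.JoinableQueue()
    worker = multiprocessing.Process(target = consume, args = (tasks,))
    worker.start()
    for item in items:
        tasks.put(item)
    tasks.put(None)             # tell the worker to stop
    tasks.join()                # blocks until every put() got a matching task_done()
    worker.join()
    return True

if __name__ == '__main__':
    print(run_tasks([1, 2, 3]))
```

Here tasks.join() plays the role of wait_completion, so no busy-wait loop or shared keep_going value is needed.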