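Multiprocessing a queue with Python

Time: 2014-06-01 20:21:29

Tags: python queue multiprocessing pool

I am trying to create a basic script that uses multiprocessing to work through a queue full of objects and call a method on each one.

I understand the principles of multiprocessing, pools and so on. Please see below: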

from multiprocessing import Queue, Pool
from object import obj
import time

currentTime = time.time() #used to work out how long it takes for the script to complete

work_queue = Queue()

#create some objects to work with
a = obj('obj1')
b = obj('obj2')

#put them in the queue
work_queue.put(a)
work_queue.put(b)

def doFunction(q):
    temp = q.get()
    print temp.getVar()

if __name__ == '__main__':
    pool = Pool(2) #Create a pool with two processes
    pool.map(doFunction, work_queue, 1)
    pool.close()
    pool.join()

print time.time() - currentTime #prints out the time taken to run script

This raises an error:

Traceback (most recent call last):
  File "/home/adam/workspace/MultiProcessing/test2.py", line 35, in <module>
    pool.map(doSNMP, work_queue, 1)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 304, in map_async
    iterable = list(iterable)
TypeError: 'Queue' object is not iterable

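If anyone can offer any input, I would be very grateful!

1 answer:

Answer 0: (score: 0)

Error: Queue is not iterable

The traceback is quite clear about the problem. It even came as a surprise to me, but Queue really is not iterable.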
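The last frames of the traceback show why: pool.map calls list(iterable) on its argument, and a queue only offers get() and put(). If the work really has to live in a queue, one possible way to make it iterable is the two-argument form of the built-in iter(). The following is only a sketch in Python 3 syntax; drain and SENTINEL are hypothetical names that do not appear in the original code, and it uses the standard-library queue.Queue for brevity, on the assumption that any queue with a blocking get() behaves the same way here.

```python
import queue

SENTINEL = None  # hypothetical end-of-work marker; must never be a real work item


def drain(q, sentinel=SENTINEL):
    """Return an iterator over the items in q, stopping at the sentinel."""
    # iter(callable, sentinel) calls q.get() repeatedly and stops as soon
    # as the returned value compares equal to the sentinel.
    return iter(q.get, sentinel)


if __name__ == "__main__":
    work_queue = queue.Queue()
    for name in ("obj1", "obj2"):
        work_queue.put(name)
    work_queue.put(SENTINEL)  # without this, the blocking get() would wait forever

    # The queue itself cannot be passed to list(), but the iterator can.
    print(list(drain(work_queue)))
```

The sentinel matters: get() blocks on an empty queue, so the iterator needs an explicit signal that no more work is coming.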

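Modified test code

Queue is mainly used for passing items between processes. But your code only needs to do some work on each object and read an attribute from it, so there is not much to communicate.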
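For contrast, here is roughly what the communication case looks like, where a Queue earns its place. This is a sketch under assumptions, not the asker's code: it is written in Python 3 syntax, and worker, run and STOP are hypothetical names introduced for illustration. Each worker process pulls names from one queue and pushes results onto another until it receives a stop marker.

```python
from multiprocessing import Process, Queue

STOP = None  # hypothetical sentinel telling a worker to exit


def worker(tasks, results):
    # Pull names until the sentinel arrives and push one result per name.
    for name in iter(tasks.get, STOP):
        results.put(name.upper())


def run(names, num_workers=2):
    """Process a list of names in worker processes and return sorted results."""
    tasks, results = Queue(), Queue()
    procs = [Process(target=worker, args=(tasks, results))
             for _ in range(num_workers)]
    for p in procs:
        p.start()
    for name in names:
        tasks.put(name)
    for _ in procs:
        tasks.put(STOP)  # one sentinel per worker so that each of them exits
    # Collect the results before joining, so no worker is left blocked
    # on data it has queued but that nobody has read yet.
    out = [results.get() for _ in names]
    for p in procs:
        p.join()
    return sorted(out)  # arrival order depends on scheduling, so sort


if __name__ == "__main__":
    print(run(["obj1", "obj2", "obj3"]))
```

That is a fair amount of bookkeeping compared with a pool, which is why a plain iterable is the simpler choice when the workers have nothing to send to each other.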

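So I modified your code.

  • The object class is defined in the code itself
  • work_queue is replaced by something iterable; in this case it is a generator that can produce any number of items of type Obj.

That leaves the use of pool.map. It takes an iterable of jobs and hands them out to the worker processes.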
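To make the behaviour of pool.map a little more concrete: it turns the iterable into a list (the list(iterable) call visible in the traceback), splits it into chunks of the given chunksize, and returns the results in input order. The sketch below uses Python 3 syntax and returns values instead of printing them; the snake_case names (job_generator, do_function, get_var) are renamed stand-ins for illustration, not the code from the question. pool.imap is shown as an alternative that yields results one at a time instead of returning a complete list.

```python
from multiprocessing import Pool


class Obj:
    def __init__(self, name):
        self.name = name

    def get_var(self):
        return self.name


def job_generator(num):
    # Lazily produce num objects named obj1 .. objN.
    for i in range(1, num + 1):
        yield Obj("obj" + str(i))


def do_function(job):
    # Runs in a worker process; the return value is sent back to the parent.
    return job.get_var()


if __name__ == "__main__":
    with Pool(2) as pool:
        # map blocks until every job is done; results keep the input order.
        names = pool.map(do_function, job_generator(10), 2)
        # imap returns an iterator, so consume it while the pool is still open.
        lazy_names = list(pool.imap(do_function, job_generator(10), 2))
    print(names)
    print(names == lazy_names)
```

Both the function and the objects must be picklable, which is why Obj and do_function are defined at module level.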

from multiprocessing import Pool
import time

class Obj(object):
    def __init__(self, name):
        self.name = name
    def getVar(self):
        return self.name


def jobgenerator(num):
    #yield num objects named obj1 .. objN
    for i in xrange(1, num+1):
        yield Obj("obj"+str(i))

def doFunction(job):
    time.sleep(1) #simulate one second of work per job
    print job.getVar()

if __name__ == '__main__':
    currentTime = time.time() #used to work out how long it takes for the script to complete
    pool = Pool(2) #Create a pool with two processes
    pool.map(doFunction, jobgenerator(10), 2) #hand out the jobs in chunks of 2
    pool.close()
    pool.join()
    print time.time() - currentTime #prints out the time taken to run script