Python multiprocessing: refilling a multiprocessing queue before it becomes empty

Time: 2016-08-03 16:36:35

Tags: python queue multiprocessing

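I am trying to create a multiprocessing queue in Python 2.7 that is filled with processes up to its maxsize. While there are still processes waiting to run that have not yet been put on the queue, the queue should be refilled whenever one of the current processes finishes. I am trying to maximize performance, so the size of the queue is numCores on the PC, so that every core is always doing work (ideally the CPU stays at 100% use the whole time). I am also trying to avoid context switching, which is why I only want that many processes in the queue at any one time.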

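For example, say there are 50 tasks to complete and the CPU has 4 cores, so the queue has maxsize 4. We start by filling the Queue with 4 processes. As soon as any of those 4 finishes (at which point there are 3 in the queue), a new proc is spawned and sent to the queue. This continues until all 50 tasks have been spawned and completed.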

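This is proving difficult to implement because I am new to multiprocessing. The join() function does not seem to work for me either, since it blocks until all of the processes in the queue have finished, which is not what I want.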

Here is my code right now:

def queuePut(q, thread):
    q.put(thread)


def launchThreads(threadList, performanceTestList, resultsPath, cofluentExeName):
    numThreads = len(threadList)
    threadsLeft = numThreads
    print "numThreads: " + str(numThreads)
    cpuCount = multiprocessing.cpu_count()
    q = multiprocessing.Queue(maxsize=cpuCount)
    count = 0
    while count != numThreads:
        while not q.full():
            thread = threadList[numThreads - threadsLeft]
            p = multiprocessing.Process(target=queuePut, args=(q,thread))
            print "Starting thread " + str(numThreads - threadsLeft)
            p.start()
            threadsLeft-=1
            count +=1
        if(threadsLeft == 0):
            threadsLeft+=1
            break

Here is where it is called in the code:

for i in testNames:
    p = multiprocessing.Process(target=worker, args=(i, paths[0], cofluentExeName,))
    jobs.append(p)

launchThreads(jobs, testNames, testDirectory, cofluentExeName)

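The procs do seem to be created and put on the queue. For an example with 12 tasks and 40 cores, the output is as follows, followed by the errors below: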

numThreads: 12
Starting thread 0
Starting thread 1
Starting thread 2
Starting thread 3
Starting thread 4
Starting thread 5
Starting thread 6
Starting thread 7
Starting thread 8
Starting thread 9
Starting thread 10
Starting thread 11

  File "C:\Python27\lib\multiprocessing\queues.py", line 262, in _feed
    send(obj)
  File "C:\Python27\lib\multiprocessing\process.py", line 290, in __reduce__
    'Pickling an AuthenticationString object is '
TypeError: Pickling an AuthenticationString object is disallowed for security reasons
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\queues.py", line 262, in _feed
    send(obj)
  File "C:\Python27\lib\multiprocessing\process.py", line 290, in __reduce__
    'Pickling an AuthenticationString object is '
TypeError: Pickling an AuthenticationString object is disallowed for security reasons
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\queues.py", line 262, in _feed
    send(obj)
  File "C:\Python27\lib\multiprocessing\process.py", line 290, in __reduce__
    'Pickling an AuthenticationString object is '
TypeError: Pickling an AuthenticationString object is disallowed for security reasons
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\queues.py", line 262, in _feed
    send(obj)
  File "C:\Python27\lib\multiprocessing\process.py", line 290, in __reduce__
    'Pickling an AuthenticationString object is '
TypeError: Pickling an AuthenticationString object is disallowed for security reasons

1 Answer:

Answer 0: (score: 3)

Why not use a multiprocessing Pool to accomplish this?

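import multiprocessing

if __name__ == '__main__':  # guard needed on Windows, where children re-import this module
    pool = multiprocessing.Pool()
    pool.map(your_function, dataset)  # dataset is a list; any other iterable also works
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the worker processes to exit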

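multiprocessing.Pool() accepts the argument processes=N to set the number of worker processes you want to start. If you omit it, the pool starts as many workers as you have cores (so with 4 cores you get 4 workers). When a worker finishes one task it automatically picks up the next one; you don't have to manage that yourself.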
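As a minimal sketch of the 50-tasks-on-4-workers scenario from the question: `run_test` is a hypothetical stand-in for the real `worker` function, and the list of test names is made-up example data. `imap_unordered` hands back each result as soon as its task finishes, so a freed worker moves straight on to the next pending task.

```python
import multiprocessing


def run_test(name):
    # Hypothetical stand-in for the real per-test work.
    return name, len(name)


def run_all(names, processes=None):
    # processes=None lets Pool default to multiprocessing.cpu_count().
    pool = multiprocessing.Pool(processes=processes)
    try:
        # Results arrive in completion order, not submission order.
        results = dict(pool.imap_unordered(run_test, names))
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the workers to exit
    return results


if __name__ == '__main__':
    # The guard is required on Windows, where children re-import this module.
    test_names = ['test_%d' % i for i in range(50)]  # assumed example data
    results = run_all(test_names, processes=4)
    print(len(results))
```

At most `processes` tasks run at any moment, which is the throttling the question was trying to build by hand with a bounded Queue.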

multiprocessing documentation: https://docs.python.org/2/library/multiprocessing.html
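On the TypeError in the question: a multiprocessing.Queue pickles every item passed to put(), and the question's code puts Process objects on the queue. A Process carries an authentication key that refuses to be pickled, which matches the traceback. The sketch below only illustrates that difference; `is_picklable` and `worker` are hypothetical helper names, not part of the original code.

```python
import multiprocessing
import pickle


def is_picklable(obj):
    # Mimics what the queue's feeder thread does with each item:
    # it has to serialize the object before sending it.
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def worker(name):
    # Hypothetical stand-in for the question's worker function.
    return name


if __name__ == '__main__':
    proc = multiprocessing.Process(target=worker, args=('test_0',))
    print(is_picklable(proc))      # the Process object itself
    print(is_picklable('test_0'))  # a plain task description
```

Plain task descriptions such as test names serialize without trouble, which is why handing those to a Pool (or to a queue read by long-lived workers) avoids the error.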