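I'm having a lot of trouble understanding how multiprocessing queues work in Python and how to implement one. Say I have two Python modules that access data from a shared file; let's call these two modules the writer and the reader. My plan is to have both the reader and the writer put requests into two separate multiprocessing queues, and then have a third process pop these requests in a loop and execute them.
My main problem is that I really don't know how to implement multiprocessing.Queue correctly. You can't instantiate the object in each process, because those would be separate queues. How do you make sure that all processes refer to a shared queue (or, in this case, queues)?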
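Answer 0 (score: 80)
"My main problem is that I really don't know how to implement multiprocessing.Queue correctly. You can't instantiate the object in each process, because those would be separate queues. How do you make sure that all processes refer to a shared queue (or, in this case, queues)?"
Here is a simple example of a reader and a writer sharing a single queue. The writer sends a bunch of integers to the reader; when the writer runs out of numbers, it sends 'DONE', which tells the reader to break out of its read loop.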
from multiprocessing import Process, Queue
import time
import sys

def reader_proc(queue):
    ## Read from the queue; this will be spawned as a separate Process
    while True:
        msg = queue.get()         # Read from the queue and do nothing
        if (msg == 'DONE'):
            break

def writer(count, queue):
    ## Write to the queue
    for ii in range(0, count):
        queue.put(ii)             # Write 'count' numbers into the queue
    queue.put('DONE')

if __name__=='__main__':
    pqueue = Queue() # writer() writes to pqueue from _this_ process
    for count in [10**4, 10**5, 10**6]:
        ### reader_proc() reads from pqueue as a separate process
        reader_p = Process(target=reader_proc, args=((pqueue),))
        reader_p.daemon = True
        reader_p.start()        # Launch reader_proc() as a separate python process

        _start = time.time()
        writer(count, pqueue)    # Send a lot of stuff to reader()
        reader_p.join()         # Wait for the reader to finish
        print("Sending {0} numbers to Queue() took {1} seconds".format(count,
            (time.time() - _start)))
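Answer 1 (score: 7)
With "from queue import Queue" there is no module named queue (this is the error Python 2 reports), and multiprocessing should be used instead. So it should look like "from multiprocessing import Queue".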
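To make the distinction concrete, here is a minimal sketch, assuming Python 3. The standard-library queue.Queue is meant for threads inside one process, while multiprocessing.Queue can be handed to a child process. The function names child and round_trip are made up for this illustration.

```python
from multiprocessing import Process, Queue


def child(q):
    # Runs in the child process; q is the queue the parent created and passed in.
    q.put("hello from the child")


def round_trip():
    q = Queue()                            # created once, in the parent
    p = Process(target=child, args=(q,))   # share it by passing it as an argument
    p.start()
    msg = q.get()                          # blocks until the child has put something
    p.join()
    return msg


if __name__ == "__main__":
    print(round_trip())
```

The point is that the queue is created once and passed to the process that needs it, rather than being instantiated separately in each process.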
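Answer 2 (score: 6)
While trying to set up a way of doing multiprocessing that uses queues to pass around large pandas dataframes, I looked at multiple answers on Stack Overflow and around the web. It seemed to me that every answer repeated the same kind of solution without considering the multitude of edge cases you are sure to run into when setting up calculations like these. The problem is that many things are in play at the same time: the number of tasks, the number of workers, the duration of each task, and possible exceptions during task execution. All of these make synchronization tricky, and most answers do not address how to go about it. So this is my take after fiddling around for a few hours; hopefully it is generic enough for most people to find it useful.
Some thoughts before any code examples. Since queue.Empty, queue.qsize() or any other similar method is unreliable for flow control, any code like this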
while True:
    try:
        task = pending_queue.get_nowait()
    except queue.Empty:
        break
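is bogus. It will kill the worker even if another task turns up in the queue a few milliseconds later. The worker will not recover, and after a while ALL the workers disappear, because each of them happened to find the queue momentarily empty. The end result is that the main multiprocessing function (the one with the join() on the processes) returns without all the tasks having completed. Nice. Good luck debugging that if you have thousands of tasks and a few are missing.
The other issue is the use of sentinel values. Many people suggest adding a sentinel value to the queue to flag the end of the queue. But flag it to whom, exactly? If there are N workers (assuming N is the number of cores available, give or take), a single sentinel value flags the end of the queue to only one worker. All the other workers will sit waiting for more work when there is none left. Typical examples I've seen are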
while True:
    task = pending_queue.get()
    if task == SOME_SENTINEL_VALUE:
        break
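One worker gets the sentinel value while the rest wait indefinitely. No post I came across mentioned that you need to put the sentinel value on the queue AT LEAST as many times as you have workers, so that ALL of them get it.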
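As a minimal sketch of that point (not the full solution given further down): each worker uses a blocking get() and stops at the first sentinel it sees, and the parent enqueues one sentinel per worker. The names worker and run, the squaring "task" and the default of 3 workers are assumptions made up for this illustration.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # assumed marker; any value that can never be a real task works


def worker(tasks, results):
    # Blocking get(): an empty queue just means "wait", never "exit".
    while True:
        task = tasks.get()
        if task is SENTINEL:
            break                      # this worker consumed exactly one sentinel
        results.put(task * task)       # stand-in for real work


def run(values, num_workers=3):
    tasks, results = Queue(), Queue()
    for v in values:
        tasks.put(v)
    for _ in range(num_workers):       # one sentinel per worker, after the tasks
        tasks.put(SENTINEL)
    procs = [Process(target=worker, args=(tasks, results))
             for _ in range(num_workers)]
    for p in procs:
        p.start()
    # Collect a known number of results instead of trusting qsize()/empty().
    out = sorted(results.get() for _ in values)
    for p in procs:
        p.join()
    return out


if __name__ == "__main__":
    print(run(range(10)))
```

Results are collected before join() so that the workers are never blocked flushing data into a queue nobody is reading.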
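The other issue is the handling of exceptions during task execution. Again, these should be caught and managed. Moreover, if you have a completed_tasks queue, you should independently count, in a deterministic way, how many items it holds before you decide that the job is done. Again, relying on queue sizes is bound to fail and return unexpected results.
In the example below, the par_proc() function receives a list of tasks, including the functions these tasks should be executed with, along with any named arguments and values.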
import multiprocessing as mp
import dill as pickle
import queue
import time
import psutil

SENTINEL = None


def do_work(tasks_pending, tasks_completed):
    # Get the current worker's name
    worker_name = mp.current_process().name

    while True:
        try:
            task = tasks_pending.get_nowait()
        except queue.Empty:
            print(worker_name + ' found an empty queue. Sleeping for a while before checking again...')
            time.sleep(0.01)
        else:
            try:
                if task == SENTINEL:
                    print(worker_name + ' no more work left to be done. Exiting...')
                    break

                print(worker_name + ' received some work... ')
                time_start = time.perf_counter()
                work_func = pickle.loads(task['func'])
                result = work_func(**task['task'])
                tasks_completed.put({work_func.__name__: result})
                time_end = time.perf_counter() - time_start
                print(worker_name + ' done in {} seconds'.format(round(time_end, 5)))
            except Exception as e:
                print(worker_name + ' task failed. ' + str(e))
                tasks_completed.put({work_func.__name__: None})


def par_proc(job_list, num_cpus=None):

    # Get the number of cores
    if not num_cpus:
        num_cpus = psutil.cpu_count(logical=False)

    print('* Parallel processing')
    print('* Running on {} cores'.format(num_cpus))

    # Set-up the queues for sending and receiving data to/from the workers
    tasks_pending = mp.Queue()
    tasks_completed = mp.Queue()

    # Gather processes and results here
    processes = []
    results = []

    # Count tasks
    num_tasks = 0

    # Add the tasks to the queue
    for job in job_list:
        for task in job['tasks']:
            expanded_job = {}
            num_tasks = num_tasks + 1
            expanded_job.update({'func': pickle.dumps(job['func'])})
            expanded_job.update({'task': task})
            tasks_pending.put(expanded_job)

    # Use as many workers as there are cores (usually chokes the system so better use less)
    num_workers = num_cpus

    # We need as many sentinels as there are worker processes so that ALL processes exit when there is no more
    # work left to be done.
    for c in range(num_workers):
        tasks_pending.put(SENTINEL)

    print('* Number of tasks: {}'.format(num_tasks))

    # Set-up and start the workers
    for c in range(num_workers):
        p = mp.Process(target=do_work, args=(tasks_pending, tasks_completed))
        p.name = 'worker' + str(c)
        processes.append(p)
        p.start()

    # Gather the results
    completed_tasks_counter = 0
    while completed_tasks_counter < num_tasks:
        results.append(tasks_completed.get())
        completed_tasks_counter = completed_tasks_counter + 1

    for p in processes:
        p.join()

    return results
Here is a test that runs the code above
def test_parallel_processing():
    def heavy_duty1(arg1, arg2, arg3):
        return arg1 + arg2 + arg3

    def heavy_duty2(arg1, arg2, arg3):
        return arg1 * arg2 * arg3

    task_list = [
        {'func': heavy_duty1, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
        {'func': heavy_duty2, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
    ]

    results = par_proc(task_list)

    job1 = sum([y for x in results if 'heavy_duty1' in x.keys() for y in list(x.values())])
    job2 = sum([y for x in results if 'heavy_duty2' in x.keys() for y in list(x.values())])

    assert job1 == 15
    assert job2 == 21
plus another one with some exceptions
def test_parallel_processing_exceptions():
    def heavy_duty1_raises(arg1, arg2, arg3):
        raise ValueError('Exception raised')
        return arg1 + arg2 + arg3

    def heavy_duty2(arg1, arg2, arg3):
        return arg1 * arg2 * arg3

    task_list = [
        {'func': heavy_duty1_raises, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
        {'func': heavy_duty2, 'tasks': [{'arg1': 1, 'arg2': 2, 'arg3': 3}, {'arg1': 1, 'arg2': 3, 'arg3': 5}]},
    ]

    results = par_proc(task_list)

    job1 = sum([y for x in results if 'heavy_duty1' in x.keys() for y in list(x.values())])
    job2 = sum([y for x in results if 'heavy_duty2' in x.keys() for y in list(x.values())])

    assert not job1
    assert job2 == 21
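Hope that helps.
Answer 3 (score: 1)
We implemented two versions of this. One is a simple multi-thread pool that can execute many types of callables, which makes our lives much easier; the second version uses processes, which is less flexible in terms of callables and requires an extra call to dill.
Setting frozen_pool to true freezes execution until finish_pool_queue is called in either class.
Thread version: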
'''
Created on Nov 4, 2019

@author: Kevin
'''
from threading import Lock, Thread
from Queue import Queue
import traceback
from helium.loaders.loader_retailers import print_info
from time import sleep
import signal
import os

class ThreadPool(object):
    def __init__(self, queue_threads, *args, **kwargs):
        self.frozen_pool = kwargs.get('frozen_pool', False)
        self.print_queue = kwargs.get('print_queue', True)
        self.pool_results = []
        self.lock = Lock()
        self.queue_threads = queue_threads
        self.queue = Queue()
        self.threads = []

        for i in range(self.queue_threads):
            t = Thread(target=self.make_pool_call)
            t.daemon = True
            t.start()
            self.threads.append(t)

    def make_pool_call(self):
        while True:
            if self.frozen_pool:
                #print '--> Queue is frozen'
                sleep(1)
                continue

            item = self.queue.get()
            if item is None:
                break

            call = item.get('call', None)
            args = item.get('args', [])
            kwargs = item.get('kwargs', {})
            keep_results = item.get('keep_results', False)

            try:
                result = call(*args, **kwargs)

                if keep_results:
                    self.lock.acquire()
                    self.pool_results.append((item, result))
                    self.lock.release()

            except Exception as e:
                self.lock.acquire()
                print e
                traceback.print_exc()
                self.lock.release()
                os.kill(os.getpid(), signal.SIGUSR1)

            self.queue.task_done()

    def finish_pool_queue(self):
        self.frozen_pool = False

        while self.queue.unfinished_tasks > 0:
            if self.print_queue:
                print_info('--> Thread pool... %s' % self.queue.unfinished_tasks)
            sleep(5)

        self.queue.join()

        for i in range(self.queue_threads):
            self.queue.put(None)

        for t in self.threads:
            t.join()

        del self.threads[:]

    def get_pool_results(self):
        return self.pool_results

    def clear_pool_results(self):
        del self.pool_results[:]
Process version:
'''
Created on Nov 4, 2019

@author: Kevin
'''
import traceback
from helium.loaders.loader_retailers import print_info
from time import sleep
import signal
import os
from multiprocessing import Queue, Process, Value, Array, JoinableQueue
from dill import dill
import ctypes
from helium.misc.utils import ignore_exception

class ProcessPool(object):
    def __init__(self, queue_processes, *args, **kwargs):
        self.frozen_pool = Value(ctypes.c_bool, kwargs.get('frozen_pool', False))
        self.print_queue = kwargs.get('print_queue', True)
        self.pool_results = Array(ctypes.c_char_p, kwargs.get('pool_result_size', 0))
        self.queue_processes = queue_processes
        self.queue = JoinableQueue()
        self.processes = []

        for i in range(self.queue_processes):
            p = Process(target=self.make_pool_call)
            p.start()
            self.processes.append(p)

        print 'Processes', self.queue_processes

    def make_pool_call(self):
        while True:
            if self.frozen_pool.value:
                #print '--> Queue is frozen'
                sleep(1)
                continue

            item_pickled = self.queue.get()

            if item_pickled is None:
                print '--> Ending'
                self.queue.task_done()
                break

            item = dill.loads(item_pickled)

            call = item.get('call', None)
            args = item.get('args', [])
            kwargs = item.get('kwargs', {})
            keep_results = item.get('keep_results', False)

            try:
                result = call(*args, **kwargs)

                if keep_results:
                    self.pool_results.append((item, result))

            except Exception as e:
                print e
                traceback.print_exc()
                os.kill(os.getpid(), signal.SIGUSR1)

            self.queue.task_done()

    def finish_pool_queue(self):
        self.frozen_pool.value = False

        while self.queue.qsize() > 0:
            if self.print_queue:
                print_info('--> Process pool... %s' % self.queue.qsize())
            sleep(5)

        for i in range(self.queue_processes):
            self.queue.put(None)

        self.queue.join()
        self.queue.close()

        for p in self.processes:
            with ignore_exception: p.join(15)
        with ignore_exception: del self.processes[:]

    def get_pool_results(self):
        return self.pool_results

    def clear_pool_results(self):
        del self.pool_results[:]


def test(eg):
    print 'EG', eg
Call with:
tp = ThreadPool(queue_threads=2)
tp.queue.put({'call': test, 'args': [random.randint(0, 100)]})
tp.finish_pool_queue()
or
pp = ProcessPool(queue_processes=2)
pp.queue.put(dill.dumps({'call': test, 'args': [random.randint(0, 100)]}))
pp.queue.put(dill.dumps({'call': test, 'args': [random.randint(0, 100)]}))
pp.finish_pool_queue()
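Answer 4 (score: 1)
A multi-producer, multi-consumer example, verified. It should be easy to modify it to cover the other cases: single or multiple producers, single or multiple consumers.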
from multiprocessing import Process, JoinableQueue
import time
import os

q = JoinableQueue()

def producer():
    for item in range(30):
        time.sleep(2)
        q.put(item)
    pid = os.getpid()
    print(f'producer {pid} done')


def worker():
    while True:
        item = q.get()
        pid = os.getpid()
        print(f'pid {pid} Working on {item}')
        print(f'pid {pid} Finished {item}')
        q.task_done()

for i in range(5):
    p = Process(target=worker, daemon=True).start()

# send thirty task requests to the worker
producers = []
for i in range(2):
    p = Process(target=producer)
    producers.append(p)
    p.start()

# make sure producers done
for p in producers:
    p.join()

# block until all workers are done
q.join()
print('All work completed')
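Explanation:
Answer 5 (score: 0)
Here is a dead simple usage of multiprocessing.Queue and multiprocessing.Process that lets callers send an "event" plus arguments to a separate process, which dispatches the event to a "do_" method on that process. (Python 3.4+)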
import multiprocessing as mp
import collections

Msg = collections.namedtuple('Msg', ['event', 'args'])

class BaseProcess(mp.Process):
    """A process backed by an internal queue for simple one-way message passing.
    """
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.queue = mp.Queue()

    def send(self, event, *args):
        """Puts the event and args as a `Msg` on the queue
        """
        msg = Msg(event, args)
        self.queue.put(msg)

    def dispatch(self, msg):
        event, args = msg

        handler = getattr(self, "do_%s" % event, None)
        if not handler:
            raise NotImplementedError("Process has no handler for [%s]" % event)

        handler(*args)

    def run(self):
        while True:
            msg = self.queue.get()
            self.dispatch(msg)
Usage:
class MyProcess(BaseProcess):
    def do_helloworld(self, arg1, arg2):
        print(arg1, arg2)

if __name__ == "__main__":
    process = MyProcess()
    process.start()
    process.send('helloworld', 'hello', 'world')
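The send happens in the parent process; the do_* happens in the child process.
I left out any exception handling, so an exception will obviously interrupt the run loop and exit the child process. You can also customize it by overriding run to control blocking or whatever else.
This is really only useful when you have a single worker process, but I think it is a relevant answer to this question because it demonstrates a common scenario with a little more object orientation.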
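One possible way to fill in the parts left out above is sketched here: a reserved shutdown event ends the run loop, and handler exceptions are reported instead of killing the child. The class name StoppableProcess, the stop() method and the 'stop' event name are all assumptions made up for this sketch, and it sends plain tuples rather than the Msg namedtuple to keep it short.

```python
import multiprocessing as mp

STOP = 'stop'  # hypothetical reserved event name, not part of the answer above


class StoppableProcess(mp.Process):
    """Event-dispatch process with a shutdown message (a sketch)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.queue = mp.Queue()

    def send(self, event, *args):
        # Messages are plain (event, args) tuples.
        self.queue.put((event, args))

    def stop(self):
        # Ask the run loop to exit once earlier messages have been handled.
        self.send(STOP)

    def dispatch(self, event, args):
        handler = getattr(self, "do_%s" % event, None)
        if handler is None:
            raise NotImplementedError("Process has no handler for [%s]" % event)
        handler(*args)

    def run(self):
        while True:
            event, args = self.queue.get()
            if event == STOP:
                break
            try:
                self.dispatch(event, args)
            except Exception as exc:
                # Report the failure and keep serving later messages.
                print("handler for %r failed: %r" % (event, exc))


class MyProcess(StoppableProcess):
    def do_helloworld(self, arg1, arg2):
        print(arg1, arg2)


if __name__ == "__main__":
    process = MyProcess()
    process.start()
    process.send('helloworld', 'hello', 'world')
    process.stop()
    process.join()   # returns once the child has seen the stop message
```

Because the queue is FIFO, messages sent before stop() are still dispatched before the child exits.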
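Answer 6 (score: 0)
I just made a simple and general example that demonstrates passing a message over a Queue between 2 standalone programs. It doesn't directly answer the OP's question, but it should be clear enough to show the concept.
Server: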
multiprocessing-queue-manager-server.py
import asyncio
import concurrent.futures
import multiprocessing
import multiprocessing.managers
import queue
import sys
import threading
from typing import Any, AnyStr, Dict, Union


class QueueManager(multiprocessing.managers.BaseManager):

    def get_queue(self, ident: Union[AnyStr, int, type(None)] = None) -> multiprocessing.Queue:
        pass


def get_queue(ident: Union[AnyStr, int, type(None)] = None) -> multiprocessing.Queue:
    global q

    if not ident in q:
        q[ident] = multiprocessing.Queue()

    return q[ident]


q: Dict[Union[AnyStr, int, type(None)], multiprocessing.Queue] = dict()
delattr(QueueManager, 'get_queue')


def init_queue_manager_server():
    if not hasattr(QueueManager, 'get_queue'):
        QueueManager.register('get_queue', get_queue)


def serve(no: int, term_ev: threading.Event):
    manager: QueueManager
    with QueueManager(authkey=QueueManager.__name__.encode()) as manager:
        print(f"Server address {no}: {manager.address}")

        while not term_ev.is_set():
            try:
                item: Any = manager.get_queue().get(timeout=0.1)
                print(f"Client {no}: {item} from {manager.address}")
            except queue.Empty:
                continue


async def main(n: int):
    init_queue_manager_server()
    term_ev: threading.Event = threading.Event()
    executor: concurrent.futures.ThreadPoolExecutor = concurrent.futures.ThreadPoolExecutor()

    i: int
    for i in range(n):
        asyncio.ensure_future(asyncio.get_running_loop().run_in_executor(executor, serve, i, term_ev))

    # Gracefully shut down
    try:
        await asyncio.get_running_loop().create_future()
    except asyncio.CancelledError:
        term_ev.set()
        executor.shutdown()
        raise


if __name__ == '__main__':
    asyncio.run(main(int(sys.argv[1])))
Client:
multiprocessing-queue-manager-client.py
import multiprocessing
import multiprocessing.managers
import os
import sys
from typing import AnyStr, Union


class QueueManager(multiprocessing.managers.BaseManager):

    def get_queue(self, ident: Union[AnyStr, int, type(None)] = None) -> multiprocessing.Queue:
        pass


delattr(QueueManager, 'get_queue')


def init_queue_manager_client():
    if not hasattr(QueueManager, 'get_queue'):
        QueueManager.register('get_queue')


def main():
    init_queue_manager_client()

    manager: QueueManager = QueueManager(sys.argv[1], authkey=QueueManager.__name__.encode())
    manager.connect()

    message = f"A message from {os.getpid()}"
    print(f"Message to send: {message}")
    manager.get_queue().put(message)


if __name__ == '__main__':
    main()
Usage
Server:
$ python3 multiprocessing-queue-manager-server.py N
N is an integer indicating how many servers should be created. Copy one of the <server-address-N> values printed by the server and pass it as the first argument of each multiprocessing-queue-manager-client.py.
Client:
$ python3 multiprocessing-queue-manager-client.py <server-address-1>
Result
Server:
Client 1: <item> from <server-address-1>
Gist: https://gist.github.com/89062d639e40110c61c2f88018a8b0e5
UPD: I created a package, ipcq, for this.
Server:
import ipcq

with ipcq.QueueManagerServer(address=ipcq.Address.DEFAULT, authkey=ipcq.AuthKey.DEFAULT) as server:
    server.get_queue().get()
Client:
import ipcq
client = ipcq.QueueManagerClient(address=ipcq.Address.DEFAULT, authkey=ipcq.AuthKey.DEFAULT)
client.get_queue().put('a message')