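Hello, I have a data processing pipeline that I would like to optimize by running some processing threads on the CPU while MXNet prediction models run on the GPUs (Python 3.6).
The idea I want to apply is the following (assuming my machine has N GPUs):
Here is a visual description of the workflow:
The idea is to make use of the idle CPU while the GPUs are busy processing frames.
Using the threading library, I managed to read and process the first N frames, but the GPUs fail to process the next batch of frames.
Note that the source code below is simplified to clarify the workflow.
Here is the code of the function that reads frames, dispatches them to the GPUs, and then sends the outputs to the CPU queue: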
import logging
from threading import Thread

def dispatch_jobs(video_capture, detection_workers, number_of_gpu, cpu_queue):
    # detection_workers is a list of N similar MXNet models, each one works on a different GPU
    is_last_frame = False
    while not is_last_frame:
        # read up to one frame per GPU
        frames_batch = []
        for i in range(0, number_of_gpu):
            success, frame = read_frame_from_video(video_capture)
            if not success:
                logging.warning("Can't receive frame. Exiting.")
                is_last_frame = True
                break
            frames_batch.append(frame)
        # one short-lived thread per frame, each calling a different model
        workers = []
        for detection_worker_id in range(0, len(frames_batch)):
            frame_image = frames_batch[detection_worker_id]
            thread = Thread(target=detection_workers[detection_worker_id].predict, kwargs={'image': frame_image})
            workers.append(thread)
        for w in workers: w.start()
        for w in workers: w.join()
        # sending to the CPU queue
        for detection_worker_id in range(0, len(frames_batch)):
            detector_output = detection_workers[detection_worker_id].output
            cpu_queue.put(detector_output)
    logging.info("While loop is broken... putting -1 in the queue")
    cpu_queue.put(-1)
    return
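A possible variation on the dispatcher, shown only as a sketch: instead of creating new threads for every batch, keep one long-lived thread per model and feed it through its own queue, so each model is always called from the same thread. Whether thread affinity has anything to do with the stall on the second batch is an assumption and is not confirmed by the code above. The names `gpu_worker`, `run_workers`, `predict_fns` and `_STOP` are hypothetical, and plain Python callables stand in for the MXNet `predict` methods.

```python
import queue
from threading import Thread

_STOP = object()  # hypothetical sentinel telling a worker thread to exit


def gpu_worker(predict_fn, in_queue, out_queue):
    """Run predict_fn on every item taken from in_queue until _STOP arrives."""
    while True:
        item = in_queue.get()  # blocks until a frame or _STOP is available
        if item is _STOP:
            break
        frame_id, frame = item
        out_queue.put((frame_id, predict_fn(frame)))


def run_workers(predict_fns, frames):
    """Feed frames round-robin to one long-lived thread per predictor."""
    # Small bounded queues keep the reader from running far ahead of the workers.
    in_queues = [queue.Queue(maxsize=2) for _ in predict_fns]
    out_queue = queue.Queue()
    threads = [Thread(target=gpu_worker, args=(fn, q, out_queue))
               for fn, q in zip(predict_fns, in_queues)]
    for t in threads:
        t.start()
    count = 0
    for frame_id, frame in enumerate(frames):
        in_queues[frame_id % len(in_queues)].put((frame_id, frame))
        count += 1
    for q in in_queues:
        q.put(_STOP)
    for t in threads:
        t.join()
    # Outputs arrive in completion order; restore the original frame order.
    results = [out_queue.get() for _ in range(count)]
    return [output for _, output in sorted(results, key=lambda r: r[0])]
```

In the real pipeline, `out_queue` would play the role of `cpu_queue`, and the consumer would read from it while the workers are still running rather than after they have all been joined.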
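As mentioned above, a consumer thread reads the outputs from cpu_queue and passes them to a multithreaded function (on the CPU). Here is the code of the consumer function: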
import time

def consume_cpu_queue(cpu_queue, number_of_process):
    # loop instead of recursing, so a long video cannot hit the recursion limit
    while True:
        while cpu_queue.empty():
            logging.info("Sleeping 1 second")
            time.sleep(1)
        prediction_output = cpu_queue.get()
        # -1 is the end-of-stream marker put by dispatch_jobs
        if prediction_output == -1:
            return
        process_output_multithread(prediction_output, number_of_process)

def process_output_multithread(pred_output, number_of_process):
    workers = []
    for i in range(0, number_of_process):
        thread = Thread(target=process, kwargs={'pred_output': pred_output})
        workers.append(thread)
    for w in workers: w.start()
    for w in workers: w.join()
    return

# Here is how the consumer thread is initiated
# (cpu_queue, number_of_gpu and number_of_process are assumed to be defined elsewhere)
cpu_consumer_thread = Thread(target=consume_cpu_queue, args=(cpu_queue, number_of_process))
# Here is how I run the application
cpu_consumer_thread.start()
dispatch_jobs(video_capture, detection_workers, number_of_gpu, cpu_queue)
cpu_consumer_thread.join()
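The consumer side can also be written without polling, sketched here under a few assumptions. `queue.Queue.get()` already blocks until an item is available, a dedicated sentinel object avoids comparing a prediction output against `-1`, and a `ThreadPoolExecutor` reuses a fixed set of threads. The names `consume`, `process_fn`, `END_OF_STREAM` and `results` are hypothetical. Unlike `process_output_multithread` above, this version runs the processing function once per queued output rather than `number_of_process` times on the same output.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from threading import Thread

END_OF_STREAM = object()  # hypothetical sentinel, used here in place of -1


def consume(cpu_queue, process_fn, number_of_threads, results):
    """Drain cpu_queue until END_OF_STREAM, processing items in a thread pool."""
    with ThreadPoolExecutor(max_workers=number_of_threads) as pool:
        futures = []
        while True:
            item = cpu_queue.get()  # blocks, so no empty() check or sleep is needed
            if item is END_OF_STREAM:
                break
            futures.append(pool.submit(process_fn, item))
        # Collect the results in submission order.
        results.extend(f.result() for f in futures)
```

A usage example with a stand-in processing function:

```python
q = queue.Queue()
results = []
consumer = Thread(target=consume, args=(q, lambda x: x + 1, 2, results))
consumer.start()
for value in (1, 2, 3):
    q.put(value)
q.put(END_OF_STREAM)
consumer.join()
```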
I have already looked at this question, but I am not sure whether Numba can solve my problem.
Any suggestions or pointers would be very helpful.