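I can't start several processes at once without waiting for each one to terminate.
I'm walking a directory tree and processing the contents of each file in an external script.
The equivalent command line is:

    python process.py < /dir/file

Here is the Python code I'm using: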

    for root, directory, file in os.walk(dir):
        for name in file:
            input_file = open(os.path.join(root, name))
            input_text = input_file.read().encode('utf-8')
            input_file.close()
            command = "python process.py"
            process = subprocess.Popen(command.split(), shell=False, stdin=subprocess.PIPE)
            process.stdin.write(input_text)
            log.debug("Process started with pid {0}".format(process.pid))
            process.communicate()

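Is there a way to start them without waiting for each one to terminate?

Answer 0 (score: 2)

Yes. Store the processes in a list and don't call process.communicate() inside the loop, because it blocks.

From the documentation:

    Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. The optional input argument should be a string to be sent to the child process, or None, if no data should be sent to the child.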
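
To make the blocking behaviour concrete, here is a minimal sketch. It assumes a stand-in child (`SLEEPER`, a hypothetical command that just sleeps) instead of `process.py`, and the helper names are made up for illustration. It relies only on `Popen` returning once the child is spawned, `poll()` returning `None` while the child is running, and `wait()` blocking until it exits.

```python
import subprocess
import sys

# Hypothetical stand-in for "python process.py": a child that sleeps briefly.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(2)"]


def start_children(count):
    """Popen returns once the child is spawned, so this does not wait for exit."""
    return [subprocess.Popen(SLEEPER) for _ in range(count)]


def still_running(processes):
    """poll() checks for termination without blocking (None means running)."""
    return [p for p in processes if p.poll() is None]


def wait_all(processes):
    """wait() blocks until each child exits and returns its exit code."""
    return [p.wait() for p in processes]
```

Calling `start_children(3)` followed by `still_running(...)` should report all three children as alive, since nothing has waited on them yet.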

So the result should look like this:

    # list to store processes after creating them
    processes = list()
    for root, directory, file in os.walk(dir):
        for name in file:
            input_file = open(os.path.join(root, name))
            input_text = input_file.read().encode('utf-8')
            input_file.close()
            command = "python process.py"
            process = subprocess.Popen(command.split(),
                                       shell=False,
                                       stdin=subprocess.PIPE)
            processes.append(process)
            process.stdin.write(input_text)
            log.debug("Process started with pid {0}".format(process.pid))
            # do not call process.communicate() here: it would block the loop

    # wait for processes to complete
    for process in processes:
        # both values are None unless stdout/stderr were opened with subprocess.PIPE
        stdoutdata, stderrdata = process.communicate()
        # ... do something with data returned from process

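
One caveat with writing to `process.stdin` by hand: the child only sees end-of-file once `communicate()` closes the pipe, and a large write can block if the child is not reading. A possible variant, sketched below, hands the open file to the child as its stdin, which mirrors the `python process.py < /dir/file` command line from the question. The function names `start_all` and `collect` are hypothetical, and capturing stdout with `subprocess.PIPE` is an assumption about what you want to do with the output.

```python
import os
import subprocess
import sys


def start_all(top_dir, command):
    """Start one child per file under top_dir without waiting for any of them."""
    processes = []
    for root, _dirs, files in os.walk(top_dir):
        for name in files:
            path = os.path.join(root, name)
            # The child reads the file directly, like `command < path`.
            # The parent's handle can be closed right after the child starts.
            with open(path, "rb") as input_file:
                process = subprocess.Popen(command, stdin=input_file,
                                           stdout=subprocess.PIPE)
            processes.append((path, process))
    return processes


def collect(processes):
    """Wait for every child and return {path: (returncode, stdout_bytes)}."""
    results = {}
    for path, process in processes:
        stdout_data, _ = process.communicate()
        results[path] = (process.returncode, stdout_data)
    return results
```

For example, `collect(start_all("/dir", [sys.executable, "process.py"]))` would start every child first and only then wait for them. Note that this starts one process per file with no upper bound, so it is only a reasonable fit for a modest number of files.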
To limit the number of processes running at once, you might want to use the process pool provided by the multiprocessing module.
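
As a rough sketch of that idea, the example below uses `multiprocessing.pool.ThreadPool` rather than a process pool, on the assumption that the heavy work happens in the external script, so the workers only need to wait on it. `run_one`, `run_limited` and the default of 4 workers are hypothetical names and values chosen for illustration, and `subprocess.run` requires Python 3.5 or later.

```python
import os
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def run_one(args):
    """Run the command with one file on stdin; blocks only its worker thread."""
    command, path = args
    with open(path, "rb") as input_file:
        completed = subprocess.run(command, stdin=input_file,
                                   stdout=subprocess.PIPE)
    return path, completed.returncode, completed.stdout


def run_limited(top_dir, command, max_workers=4):
    """Process every file under top_dir with at most max_workers children alive."""
    paths = [os.path.join(root, name)
             for root, _dirs, files in os.walk(top_dir)
             for name in files]
    # Each worker thread runs one child at a time, which caps concurrency.
    with ThreadPool(max_workers) as pool:
        return pool.map(run_one, [(command, p) for p in paths])
```

A call such as `run_limited("/dir", [sys.executable, "process.py"], max_workers=4)` returns one `(path, returncode, stdout)` tuple per file, in the order the files were found.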