How am I using the multiprocessing (Python) module incorrectly?

Time: 2016-12-29 09:28:48

Tags: python

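Can someone help me figure out why the following code doesn't run properly? I want to spawn a new process each time a previous one finishes, but when I run this code everything launches at once: every job is reported as finished and stopped when it is not, and its console window is still open. Any ideas why is_alive() returns False when the job is actually still running?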

import subprocess
import sys
import multiprocessing
import time

start_on = 33 #'!'
end_on = 34
num_processors = 4;
jobs = []

def createInstance():
    global start_on, end_on, jobs
    cmd = "python scrape.py" + " " + str(start_on) + " " + str(end_on)
    print cmd
    p = multiprocessing.Process(target=processCreator(cmd))
    jobs.append(p)
    p.start()
    start_on += 1
    end_on += 1
    print "length of jobs is: " + str(len(jobs))

def processCreator(cmd):
    subprocess.Popen(cmd, creationflags=subprocess.CREATE_NEW_CONSOLE)

if __name__ == '__main__':
    num_processors = input("How many instances to run simultaneously?: ")
    for i in range(num_processors):
        createInstance()

    while len(jobs) > 0:
        jobs = [job for job in jobs if job.is_alive()]
        for i in range(num_processors - len(jobs)):
            createInstance()
        time.sleep(1)

    print('*** All jobs finished ***')

1 Answer:

Answer 0 (score: 0)

Your code spawns 2 processes on every createInstance() call, and I think that is what throws off the is_alive() check.

p = multiprocessing.Process(target=processCreator(cmd))

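This creates one process that is meant to run processCreator(cmd). Note, however, that target=processCreator(cmd) calls processCreator(cmd) right away in the parent and passes its return value (None) as the target, so the Process itself has nothing to do. Inside that call, subprocess.Popen(cmd, creationflags=subprocess.CREATE_NEW_CONSOLE) spawns a child process to run the command. Popen returns immediately without waiting for the child, so the multiprocessing.Process finishes almost at once, and is_alive() reports False while scrape.py is still running.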
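To make the difference between passing a function and passing the result of calling it concrete, here is a minimal sketch. The name process_creator and the strings "first" and "second" are placeholders for illustration, not part of the original script, and no process is actually started.

```python
import multiprocessing

calls = []  # records each time the function body runs in this (parent) process


def process_creator(cmd):
    # Stand-in for the real work; it only records that it was called.
    calls.append(cmd)


# The parentheses call process_creator right now, in the parent.
# Its return value (None) is what ends up being passed as target.
wrong = multiprocessing.Process(target=process_creator("first"))
calls_after_wrong = list(calls)

# Passing the function and its arguments separately defers the call
# until start() is invoked, when it runs in the child process.
right = multiprocessing.Process(target=process_creator, args=("second",))
calls_after_right = list(calls)

print(calls_after_wrong)
print(calls_after_right)
```

Even with the target=..., args=(...) form, a function that only calls subprocess.Popen returns immediately, so the Process would still end long before the command does. Something that blocks, such as subprocess.call or Popen(...).wait(), would be needed for is_alive() to track the command.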

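I think this version should work; it removes the use of multiprocessing entirely. I also changed the cmd definition to a list of arguments (see docs):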

import subprocess
import time

start_on = 33  # '!'
end_on = 34
num_processors = 4  # default; overwritten by the prompt below
jobs = []

def createInstance():
    global start_on, end_on, jobs
    # Passing the command as a list of arguments avoids string parsing issues.
    cmd = ["python", "scrape.py", str(start_on), str(end_on)]
    print(str(cmd))
    # Popen launches the child immediately; it has no start() method.
    # CREATE_NEW_CONSOLE is only available on Windows.
    p = subprocess.Popen(cmd, creationflags=subprocess.CREATE_NEW_CONSOLE)
    jobs.append(p)
    start_on += 1
    end_on += 1
    print("length of jobs is: " + str(len(jobs)))

if __name__ == '__main__':
    # int() makes this work whether input() returns a number or a string.
    num_processors = int(input("How many instances to run simultaneously?: "))
    for i in range(num_processors):
        createInstance()

    while len(jobs) > 0:
        # poll() returns None while the child process is still running.
        jobs = [job for job in jobs if job.poll() is None]
        for i in range(num_processors - len(jobs)):
            createInstance()
        time.sleep(1)

    print('*** All jobs finished ***')
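The polling loop above keeps launching new instances and has no stopping condition. As a rough sketch of the same poll()-based idea with a finite amount of work, the helper below runs a fixed list of commands while keeping at most max_parallel children alive. The function run_limited, its parameters, and the short sleeping commands are assumptions made for illustration; they stand in for the real scrape.py invocations and omit the Windows-only creationflags.

```python
import subprocess
import sys
import time


def run_limited(commands, max_parallel, poll_interval=0.05):
    """Run every command, keeping at most max_parallel children alive at once."""
    pending = list(commands)
    running = []
    exit_codes = []
    peak = 0  # highest number of simultaneously running children observed
    while pending or running:
        # poll() returns None while a child is still running.
        still_running = []
        for proc in running:
            if proc.poll() is None:
                still_running.append(proc)
            else:
                exit_codes.append(proc.returncode)
        running = still_running
        # Top up to the limit with commands that have not been started yet.
        while pending and len(running) < max_parallel:
            running.append(subprocess.Popen(pending.pop(0)))
        peak = max(peak, len(running))
        if running:
            time.sleep(poll_interval)
    return exit_codes, peak


# Hypothetical stand-in for "python scrape.py <start> <end>": each child just sleeps briefly.
commands = [[sys.executable, "-c", "import time; time.sleep(0.1)"] for _ in range(5)]
exit_codes, peak = run_limited(commands, max_parallel=2)
print(exit_codes, peak)
```

Using sys.executable instead of the bare string "python" starts the children with the same interpreter that runs the parent script, which avoids depending on what "python" resolves to on the PATH.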