Question

I am trying to use Python to automate the processing of some large data files.

One part of the processing is chained: script1 writes a file, which is then processed by script2, whose output is in turn processed by script3, and so on.

I am using the subprocess module in a threaded context.

I have one class that creates tuples of chained scripts ("scr1.sh", "scr2.sh", "scr3.sh").

Then another class runs them with a call like

for script in scriplist:
    subprocess.call(script)

My question is: in this for loop, is each script called only after subprocess.call(script1) has returned a successful retcode?

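Or, because I am using subprocess.call without any "sleep" or "wait", are all three scripts launched one right after another? I want to make sure that the second script starts only after the first one has finished.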

Edit: pydoc says "subprocess.call(*popenargs, **kwargs): Run command with arguments. Wait for command to complete, then return the returncode attribute."

So in the for loop above, does it wait for each retcode before iterating to the next script?

I am new to threading. I have attached stripped-down code for the class that runs the analysis. The subprocess.call loop is part of that class.

import subprocess
from queue import Queue
from threading import Thread

class ThreadedDataProcessor(Thread):
    def __init__(self, in_queue, out_queue):
        # Uses Queue
        Thread.__init__(self)
        self.in_queue = in_queue
        self.out_queue = out_queue

    def run(self):
        while True:
            path = self.in_queue.get()
            if path is None:
                # None is the sentinel that tells the thread to stop
                break
            # ProcessorScriptCreator is defined elsewhere (not shown)
            myprocessor = ProcessorScriptCreator(path)
            scrfiles = myprocessor.create_and_return_shell_scripts()

            # Run the chained scripts for this path, one after another
            for index, file in enumerate(scrfiles):
                subprocess.call([file])
                print("CALLED%s%s" % (index, file) * 5)
            #report(myfile.describe())
            #report("Done %s" %  path)
            self.out_queue.put(path)

in_queue = Queue()

Answer 1

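The loop calls each script in turn, waits for it to finish, and then calls the next one, regardless of whether the previous call succeeded or failed. You probably want to say: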

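try:
    # check_call raises as soon as a script exits with a non-zero status
    for script in script_list:
        subprocess.check_call(script)
except Exception as e:
    # failed script: the remaining scripts in the chain are not run
    print("script failed: %s" % e)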
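If you want to convince yourself of the difference, here is a minimal sketch. It assumes Python 3 and uses small child Python processes as hypothetical stand-ins for scr1.sh, scr2.sh and scr3.sh (the names `slow_ok` and `fails` are made up for the illustration). With `subprocess.call` all three commands run, one after another, even though the second fails; with `subprocess.check_call` the chain stops at the failure.

```python
import subprocess
import sys
import time

# Hypothetical stand-ins for the shell scripts: one child that takes
# half a second and succeeds, and one that exits with status 3.
slow_ok = [sys.executable, "-c", "import time; time.sleep(0.5)"]
fails = [sys.executable, "-c", "import sys; sys.exit(3)"]
chain = (slow_ok, fails, slow_ok)

# subprocess.call blocks until each child has finished and returns its
# exit status; the loop carries on whatever that status is.
start = time.monotonic()
retcodes = [subprocess.call(cmd) for cmd in chain]
elapsed = time.monotonic() - start
print("return codes:", retcodes)
print("elapsed: %.2f s (the two sleeps ran back to back)" % elapsed)

# subprocess.check_call also blocks, but raises CalledProcessError on a
# non-zero exit status, so the third command is never started.
started = []
failed_code = None
try:
    for cmd in chain:
        started.append(cmd)
        subprocess.check_call(cmd)
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
print("commands started:", len(started), "failed with:", failed_code)
```

The elapsed time being at least the sum of the two sleeps is what shows that the calls do not overlap.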

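A new thread starts each time start() is called on a Thread object (which runs its run method) and ends when run returns. Within a single thread, you iterate over the scripts with subprocess one at a time.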
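As a rough sketch of that pattern (assuming Python 3): each worker thread pulls a path from a queue and runs that path's chain strictly in order, while different paths may be processed concurrently by different threads. `ChainWorker` and `make_chain` are hypothetical names, and `make_chain` only stands in for whatever builds your script tuple; here it returns trivial child Python commands instead of real shell scripts.

```python
import subprocess
import sys
from queue import Queue
from threading import Thread

def make_chain(path):
    # Hypothetical stand-in for the code that creates the chained
    # scripts for one input path: three commands that just succeed.
    return [[sys.executable, "-c", "pass"] for _ in range(3)]

class ChainWorker(Thread):
    def __init__(self, in_queue, out_queue):
        super().__init__()
        self.in_queue = in_queue
        self.out_queue = out_queue

    def run(self):
        while True:
            path = self.in_queue.get()
            if path is None:  # sentinel: no more work for this thread
                break
            # Inside one thread the chain is sequential: each call
            # returns only after its child process has exited.
            retcodes = [subprocess.call(cmd) for cmd in make_chain(path)]
            self.out_queue.put((path, retcodes))

in_queue, out_queue = Queue(), Queue()
workers = [ChainWorker(in_queue, out_queue) for _ in range(2)]
for worker in workers:
    worker.start()  # start() creates the thread and invokes run()

paths = ["a.dat", "b.dat", "c.dat"]
for path in paths:
    in_queue.put(path)
for _ in workers:
    in_queue.put(None)  # one sentinel per worker
for worker in workers:
    worker.join()

results = dict(out_queue.get() for _ in paths)
print(results)
```

Putting one `None` per worker on the queue is what lets every thread leave its loop, so `join()` does not hang.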

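You should make sure that the set of calls made in one thread does not interfere with the calls made from other threads. A typical problem is scripts launched from several threads trying to read and write the same file at the same time.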
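One way to keep the threads out of each other's way, sketched here under the assumption that your scripts write their intermediate files relative to the current directory, is to give every chain its own working directory through the `cwd` argument of the subprocess functions. The child command, the file name out.txt and the helper `process` are all made up for the illustration (Python 3 assumed).

```python
import os
import subprocess
import sys
import tempfile
from threading import Thread

# Hypothetical child: writes its first argument to out.txt in the
# directory it is started in.
WRITE_OUT = "import sys; f = open('out.txt', 'w'); f.write(sys.argv[1]); f.close()"

def process(path, workdir, results):
    # cwd=workdir makes the child run inside its own directory, so two
    # chains that both write out.txt never touch the same file.
    subprocess.check_call([sys.executable, "-c", WRITE_OUT, path], cwd=workdir)
    with open(os.path.join(workdir, "out.txt")) as fh:
        results[path] = fh.read()

results = {}
threads = []
with tempfile.TemporaryDirectory() as base:
    for path in ("a.dat", "b.dat"):
        workdir = tempfile.mkdtemp(dir=base)  # one private directory per chain
        thread = Thread(target=process, args=(path, workdir, results))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()

print(results)
```

If the scripts take explicit input and output paths instead, passing distinct file names per thread achieves the same isolation.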

Python subprocess in for loop confusion
