Question

I am running an application, foo, on Linux. From a Bash script or terminal prompt, my application runs multi-threaded with this command:

$ foo -config x.ini -threads 4 < inputfile

System Monitor and top report that foo averages about 380% CPU load (quad-core machine). I have recreated this functionality in Python 2.6.x with:

import subprocess

proc = subprocess.Popen("foo -config x.ini -threads 4", \
        shell=True, stdin=subprocess.PIPE, \
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
mylist = ['this','is','my','test','app','.']
for line in mylist:
    txterr = ''
    proc.stdin.write(line.strip()+'\n')
    # block until foo reports 'Finished' on stderr (or exits) before sending the next line
    while proc.poll() is None and not txterr.count('Finished'):
        txterr += proc.stderr.readline()
    print proc.stdout.readline().strip(),

Foo runs slower, and top reports a CPU load of 100%. Foo also runs fine with shell=False, but it is still slow:

proc = subprocess.Popen("foo -config x.ini -threads 4".split(), \
        shell=False, stdin=subprocess.PIPE, \
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

Is there a way to have the Python subprocess keep all of the threads continuously fed?

Answer 1

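When you call a command with Popen like that, it doesn't matter whether it is called from Python or from the shell. It is the "foo" command that starts its processes, not Python.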

So the answer is "Yes, subprocesses can be multi-threaded when called from Python."
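To make that concrete, here is a minimal Python 3 sketch (not the asker's Python 2.6) that launches a child the way the shell line `foo ... < inputfile` does: the whole input file is handed to the child as its stdin, so Python never sits between the child and its data. Since foo is not available here, a tiny Python one-liner stands in for it. `run_with_file_stdin`, `STAND_IN_CMD` and `demo` are illustrative names, not part of the original post, and the assumption is that foo simply reads lines from stdin.

```python
import os
import subprocess
import sys
import tempfile

def run_with_file_stdin(cmd, input_path):
    """Run cmd with stdin redirected from a file, like `cmd < input_path`."""
    with open(input_path, "rb") as infile:
        # The child reads the file directly; Python does no per-line feeding.
        result = subprocess.run(cmd, stdin=infile,
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode, result.stdout.decode(), result.stderr.decode()

# Hypothetical stand-in for ["foo", "-config", "x.ini", "-threads", "4"]:
# it upper-cases every line it reads from stdin.
STAND_IN_CMD = [sys.executable, "-c",
                "import sys\nfor line in sys.stdin: sys.stdout.write(line.upper())"]

def demo():
    """Write a small input file, run the stand-in child on it, clean up."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("this\nis\nmy\ntest\napp\n.\n")
        path = f.name
    try:
        return run_with_file_stdin(STAND_IN_CMD, path)
    finally:
        os.remove(path)

if __name__ == "__main__":
    print(demo())
```

If the child is fast when started this way but slow when fed line by line, the bottleneck is probably the feeding loop rather than Popen itself.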

Answer 2

First off, are you guessing that it is single-threaded solely because it uses 100% CPU instead of 400%?

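It would be better to check how many threads it has started using the top program; hit the H key to show threads. Or use ps -eLf and make sure the NLWP column shows multiple threads.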
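The same check can be done from the Python script that launched foo. The following is a rough Python 3 sketch, assuming a Linux /proc filesystem: each entry under /proc/<pid>/task corresponds to one thread, which is the number ps reports as NLWP. `count_threads` is an illustrative helper name, not an existing API.

```python
import os

def count_threads(pid):
    """Return the number of threads of pid, or None if it cannot be read."""
    task_dir = "/proc/%d/task" % pid
    try:
        # One directory entry per thread (lightweight process).
        return len(os.listdir(task_dir))
    except OSError:
        # No /proc (not Linux), no such process, or no permission.
        return None

if __name__ == "__main__":
    print(count_threads(os.getpid()))
```

With a Popen object this would be called as `count_threads(proc.pid)`. Note that with shell=True, `proc.pid` may be the PID of the intermediate shell rather than of foo, so the shell=False variant is the easier one to inspect.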

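Linux can be quite skittish about CPU affinity; by default, the scheduler is reluctant to move a process away from the last processor it ran on. This means that if all four threads of your program were started on a single processor, they can end up sharing that processor FOR EVER. You must use a tool like taskset(1) to force CPU affinity on processes that must run on distinct processors for long periods of time. For example: taskset -cp 0 <pid1> ; taskset -cp 1 <pid2> ; taskset -cp 2 <pid3> ; taskset -cp 3 <pid4>.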

You can retrieve the affinity with taskset -p <pid> to find out what it is currently set to.
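For readers who would rather inspect or set affinity from the Python side, here is a small Python 3 sketch using `os.sched_getaffinity` and `os.sched_setaffinity` (available on Linux since Python 3.3; the asker's Python 2.6 does not have them). `get_affinity` and `set_affinity` are illustrative wrapper names. On Linux the underlying system call acts on one thread ID at a time, so pinning every thread of a multi-threaded program would need one call per thread ID.

```python
import os

def get_affinity(pid=0):
    """Return the set of CPUs pid may run on (pid 0 = caller), or None if unsupported."""
    if not hasattr(os, "sched_getaffinity"):
        return None  # not Linux, or Python older than 3.3
    return set(os.sched_getaffinity(pid))

def set_affinity(cpus, pid=0):
    """Restrict pid to the given CPUs; return False if the platform lacks support."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(pid, cpus)
    return True

if __name__ == "__main__":
    current = get_affinity()
    print("allowed CPUs:", current)
    if current:
        # Re-apply the current set: a harmless no-op that shows the call shape.
        set_affinity(current)
```

If `get_affinity(proc.pid)` already lists every core, affinity is unlikely to be what is holding foo at 100%.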

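(One day I wondered why my Folding At Home processes were using far less CPU time than I expected. I discovered that the bloody scheduler had placed three FaH tasks on one HyperThread sibling and the fourth FaH task on the other HT sibling on the same core. The other three processors were sitting idle. (The first core also ran quite hot, and the other three cores were four or five degrees cooler. Heh.))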

Answer 3

If your Python script doesn't feed the foo process fast enough, then you could offload reading stdout and stderr to threads:

from Queue import Empty, Queue
from subprocess import PIPE, Popen
from threading import Thread

def start_thread(target, *args):
    t = Thread(target=target, args=args)
    t.daemon = True
    t.start()
    return t

def signal_completion(queue, stderr):
    """Put a token on the queue each time foo reports 'Finished' on stderr."""
    for line in iter(stderr.readline, ''):
        if 'Finished' in line:
            queue.put(1) # signal completion
    stderr.close()

def print_stdout(q, stdout):
    """Print stdout upon receiving a signal."""
    text = []
    for line in iter(stdout.readline, ''):
        if not q.empty():
            try:
                q.get_nowait()
            except Empty:
                text.append(line) # queue is empty
            else: # received completion signal
                print ''.join(text),
                text = [line] # keep the line just read so it is not lost
                q.task_done()
        else: # buffer stdout until the task is finished
            text.append(line)
    stdout.close()
    if text:
        print ''.join(text), # print the rest unconditionally

queue = Queue()
proc = Popen("foo -config x.ini -threads 4".split(), bufsize=1,
             stdin=PIPE, stdout=PIPE, stderr=PIPE)
threads =  [start_thread(print_stdout, queue, proc.stdout)]
threads += [start_thread(signal_completion, queue, proc.stderr)]

mylist = ['this','is','my','test','app','.']
for line in mylist:
    proc.stdin.write(line.strip()+'\n')
proc.stdin.close()
proc.wait()
for t in threads: t.join() # wait for stdout
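For readers on Python 3, roughly the same idea can be sketched more compactly. This is a simplified variant, not a port of the code above: it drops the queue-based 'Finished' signalling and simply drains stdout and stderr in background threads while the main thread writes all the input without waiting for a reply to each line. A small Python child stands in for foo, on the assumption (taken from the question) that foo answers each input line on stdout and writes 'Finished' to stderr. `CHILD`, `drain` and `feed_all` are illustrative names.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for foo: upper-cases each input line on stdout
# and reports 'Finished' on stderr after each one.
CHILD = [sys.executable, "-c",
         "import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line.upper())\n"
         "    sys.stderr.write('Finished\\n')\n"]

def drain(stream, sink):
    """Collect every line from stream into sink, then close the stream."""
    for line in stream:
        sink.append(line)
    stream.close()

def feed_all(cmd, lines):
    """Feed all lines to cmd without blocking on its output; return (stdout, stderr) lines."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    out, err = [], []
    readers = [threading.Thread(target=drain, args=(proc.stdout, out)),
               threading.Thread(target=drain, args=(proc.stderr, err))]
    for t in readers:
        t.daemon = True
        t.start()
    # The reader threads keep both pipes empty, so these writes cannot
    # deadlock against a child that is blocked on a full output pipe.
    for line in lines:
        proc.stdin.write(line.strip() + "\n")
    proc.stdin.close()
    proc.wait()
    for t in readers:
        t.join()
    return out, err

if __name__ == "__main__":
    stdout_lines, stderr_lines = feed_all(CHILD, ['this', 'is', 'my', 'test', 'app', '.'])
    print("".join(stdout_lines), end="")
```

The design point is the same as in the answer: the writer never waits for a reply to one line before sending the next, so a multi-threaded child always has work queued on its stdin.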

How can Python keep feeding multiple threads of a subprocess?
