Question

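I have inherited code that goes into a busy loop reading a subprocess's output to look for a keyword, but I would like it to work with lower overhead. The code is as follows: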

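def stdout_search(self, file, keyword):
    # Accumulate characters until a line ending, then test the line.
    s = ''
    while True:
        c = file.read(1)
        if not c:
            return None  # EOF before the keyword appeared
        if c != '\r' and c != '\n':
            s += c
            continue
        s = s.strip()
        if keyword in s:
            break
        s = ''
    # Return everything after the keyword on the matching line.
    i = s.find(keyword) + len(keyword)
    return s[i:]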

def scan_output(self, file, ev):
    # Loop until the event is set or the child's output ends.
    while not ev.wait(0):
        s = self.stdout_search(file, 'Keyword:')
        if not s:
            break
        # Do something useful with s
        offset = ...  # calculate offset
        wx.CallAfter(self.offset_label.SetLabel, offset)
        #time.sleep(0.03)

The output of the Popen'ed process looks like:

Keyword: 1 of 100
Keyword: 2 of 100
...etc...

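Uncommenting the time.sleep(0.03) at the end of scan_output drops the load on a single core from 100% to an acceptable 25% or so, but unfortunately the offset label then redraws in fits and starts: although I am reading a frame count from 30 fps playback, the label often updates less than once a second. How can I implement this code with a more correct wait for input?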

BTW, the complete code may be found here.

Answer 1

Reading one byte at a time is inefficient. See Reading binary file in Python and looping over each byte.
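As a minimal sketch of the alternative, assuming Python 3 and a pipe opened in text mode with newline-terminated output, iterating over the file object lets the buffered reader do the work instead of issuing one read(1) call per character. The name search_lines is hypothetical, and an in-memory io.StringIO stands in for the child's stdout.

```python
import io


def search_lines(file, keyword):
    """Return the text after `keyword` on the first line containing it.

    Returns None if the file ends before the keyword is seen.
    """
    for line in file:  # buffered; blocks until a complete line is available
        line = line.strip()
        index = line.find(keyword)
        if index != -1:
            return line[index + len(keyword):]
    return None


# In-memory stand-in for the child's stdout (an assumption for the demo).
fake_stdout = io.StringIO("noise\nKeyword: 1 of 100\nKeyword: 2 of 100\n")
first = search_lines(fake_stdout, "Keyword:")
second = search_lines(fake_stdout, "Keyword:")
print(repr(first), repr(second))
```

Like the original stdout_search, this keeps the leading space after the keyword. If the child ends its progress lines with a bare '\r', line iteration only splits on it when the pipe is in universal-newlines text mode.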

If you don't need immediate feedback, use Popen.communicate() to get all the output at once.
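A minimal sketch of that approach, assuming Python 3.7+ (for the text argument). The child here is a hypothetical one-line Python script standing in for the real program.

```python
import subprocess
import sys

# Hypothetical child process: prints three 'Keyword:' lines and exits.
child_code = "for i in range(1, 4): print('Keyword:', i, 'of 3')"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    text=True,
)
output, _ = proc.communicate()  # blocks until the child exits

# Extract whatever follows the keyword on each matching line.
matches = [
    line.split("Keyword:", 1)[1].strip()
    for line in output.splitlines()
    if "Keyword:" in line
]
print(matches)
```

Because communicate() returns only after the child has exited, this does not suit a progress label that has to update while the child is still running.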

To avoid freezing the GUI, you can put the IO into a background thread. It is a simple, portable option for blocking IO that supports incremental reads.
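A sketch of the background-thread approach, assuming Python 3.7+. The worker blocks on the pipe, so it uses no CPU while waiting, and hands results over through a queue.Queue. The child script and the names reader and results are hypothetical.

```python
import queue
import subprocess
import sys
import threading


def reader(pipe, out_queue, keyword):
    """Worker thread: block on the pipe and queue the value after `keyword`."""
    for line in pipe:
        if keyword in line:
            out_queue.put(line.split(keyword, 1)[1].strip())
    out_queue.put(None)  # sentinel: the child closed its stdout


# Hypothetical child that flushes after every line.
child_code = "for i in range(1, 4): print('Keyword:', i, 'of 3', flush=True)"
proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    text=True,
)

results = queue.Queue()
worker = threading.Thread(
    target=reader, args=(proc.stdout, results, "Keyword:"), daemon=True
)
worker.start()

# A GUI would poll the queue from a timer; this demo simply drains it.
received = []
while True:
    item = results.get()
    if item is None:
        break
    received.append(item)

worker.join()
proc.wait()
print(received)
```

In the question's wxPython code the worker could call wx.CallAfter directly instead of using a queue, as scan_output already does.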

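To process output as soon as the child flushes it, you can use asynchronous I/O such as Tkinter's createfilehandler(), Gtk's io_add_watch(), etc.: you provide a callback, and the GUI calls it when the next chunk of data is ready.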

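If the child flushes data too often, the callback can simply read a chunk and put it into a buffer; you can then process the buffer every X seconds using Tkinter's widget.after() or Gtk's GObject.timeout_add(), or when it reaches a certain size or a certain number of lines, etc.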
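The buffering idea can be sketched independently of any GUI toolkit. In the sketch below, ChunkBuffer and latest_value are hypothetical names: feed() is what the I/O callback would call, and pop_lines() is what the periodic timer would call. The toolkit wiring itself is left out and the calls are simulated by hand.

```python
class ChunkBuffer:
    """Accumulates raw chunks and hands back only complete lines."""

    def __init__(self):
        self._pending = ""

    def feed(self, chunk):
        """Called from the I/O callback: cheap, just appends the chunk."""
        self._pending += chunk

    def pop_lines(self):
        """Called from the timer: return complete lines, keep the partial tail."""
        *lines, self._pending = self._pending.split("\n")
        return lines


def latest_value(lines, keyword):
    """Return the text after `keyword` on the last matching line, or None."""
    for line in reversed(lines):
        if keyword in line:
            return line.split(keyword, 1)[1].strip()
    return None


# Simulated callbacks: chunks arrive split at arbitrary positions.
buf = ChunkBuffer()
buf.feed("Keyword: 1 of 100\nKeyw")
buf.feed("ord: 2 of 100\nKeyword: 3")

# Simulated timer tick: only the newest complete value reaches the label.
lines = buf.pop_lines()
value = latest_value(lines, "Keyword:")
print(value)
```

Updating the label once per tick with only the newest value keeps the redraw rate steady however often the child flushes.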

To read up to 'Keyword:', you can use code similar to asyncio's readuntil(). See also How to read records terminated by custom separator from file in python?
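A synchronous sketch of reading records terminated by a custom separator, in the spirit of readuntil() but not its implementation. The name read_records and the chunk_size parameter are hypothetical, and the demo reads from an in-memory io.StringIO rather than a live pipe.

```python
import io


def read_records(file, separator, chunk_size=4096):
    """Yield records from `file` split on `separator`, reading in chunks."""
    buffer = ""
    while True:
        chunk = file.read(chunk_size)
        if not chunk:
            break  # end of file
        buffer += chunk
        # Everything before the last separator is complete; keep the tail.
        *records, buffer = buffer.split(separator)
        yield from records
    if buffer:
        yield buffer  # trailing data with no terminating separator


# A tiny chunk size forces the separator to straddle chunk boundaries.
source = io.StringIO("Keyword: 1 of 100\nKeyword: 2 of 100\n")
records = list(read_records(source, "Keyword:", chunk_size=5))
print(records)
```

The first record is empty because the input starts with the separator. On a live pipe, file.read(n) can block until n characters have arrived, so a binary pipe's read1() may be a better fit there.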

Low-overhead way to read from a popen handle
