Question

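In Ruby, I am using Process.spawn to run a command in a new process. I have opened a bidirectional pipe to capture stdout and stderr from the spawned process. This works well until the bytes written to the pipe (the command's stdout) exceed 64 KB, at which point the command never finishes. I think the pipe buffer limit has been hit and writes to the pipe are now blocked, so the process never completes. In my actual application, I am running a long command that produces a lot of stdout, which I need to capture and save once the process has finished. Is there a way to raise the buffer size or, better yet, have the buffer flushed so that the limit is never hit?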

cmd = "for i in {1..6600}; do echo '123456789'; done"  #works fine at 6500 iterations.

pipe_cmd_in, pipe_cmd_out = IO.pipe
cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)

Process.wait(cmd_pid)
pipe_cmd_out.close
out = pipe_cmd_in.read
puts "child: cmd out length = #{out.length}"

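Update: Open3::capture2e does seem to work for the simple example I showed. Unfortunately, in my actual application I also need the pid of the spawned process, and I need control over when execution blocks. The general idea is that I fork a non-blocking process. In this fork, I spawn a command. I send the command pid back to the parent process, then wait for the command to finish to get its exit status. When the command finishes, the exit status is sent back to the parent. In the parent, a loop iterates every second, checking the DB for control actions such as pause and resume. If it finds a control action, it sends the appropriate signal to the command pid to stop or continue it. When the command eventually finishes, the parent hits the rescue block, reads the exit status pipe, and saves the status to the DB. Here is what my actual flow looks like: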

#pipes for communicating the command pid, and exit status from child to parent
pipe_parent_in, pipe_child_out = IO.pipe
pipe_exitstatus_read, pipe_exitstatus_write = IO.pipe

child_pid = fork do
    pipe_cmd_in, pipe_cmd_out = IO.pipe
    cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)
    pipe_child_out.write cmd_pid  #send command pid to parent
    pipe_child_out.close
    Process.wait(cmd_pid)
    exitstatus = $?.exitstatus
    pipe_exitstatus_write.write exitstatus  #send exitstatus to parent
    pipe_exitstatus_write.close
    pipe_cmd_out.close
    out = pipe_cmd_in.read
    #save out to DB
end

#blocking read to get the command pid from the child
pipe_child_out.close
cmd_pid = pipe_parent_in.read.to_i

loop do
    begin
        Process.getpgid(cmd_pid)  #when command is done, this will raise
        @job.reload #refresh from DB

        #based on status in the DB, pause / resume command
        if @job.status == 'pausing'
            Process.kill('SIGSTOP', cmd_pid)
        elsif @job.status == 'resuming'
            Process.kill('SIGCONT', cmd_pid)
        end
    rescue
        #command is no longer running
        pipe_exitstatus_write.close
        exitstatus = pipe_exitstatus_read.read
        #save exit status to DB
        break
    end
    sleep 1
end

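Note: I cannot have the parent poll the command output pipe, because the parent would then be blocked waiting for the pipe to close. It would not be able to pause and resume the command via the control loop.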

Answer 1

This code seems to do what you want, and may be illustrative.

cmd = "for i in {1..6600}; do echo '123456789'; done"

pipe_cmd_in, pipe_cmd_out = IO.pipe
cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)

@exitstatus = :not_done
Thread.new do
  Process.wait(cmd_pid)
  @exitstatus = $?.exitstatus
end

pipe_cmd_out.close
out = pipe_cmd_in.read
sleep(0.1) while @exitstatus == :not_done
puts "child: cmd out length = #{out.length}; Exit status: #{@exitstatus}"

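Usually, sharing data (@exitstatus) between threads requires more care, but it works here because the variable is written only once, by the thread, after it has been initialized. (It turns out that $?.exitstatus can return nil, which is why I initialized it to something else.) The call to sleep() is unlikely to execute even once, because the read() above it will not complete until the spawned process has closed its stdout.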
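If the shared instance variable is a concern, one possible variation is to flip the roles: let a background thread drain the pipe and collect its result with Thread#value, while the main thread waits for the process. This is only a sketch, not the code from the question. The helper name capture_with_reader_thread is made up for illustration, and the child command is a small Ruby one-liner (an assumption, chosen so the example does not depend on the shell supporting brace expansion).

```ruby
require "rbconfig"

# Hypothetical helper: runs cmd (given as argv parts) and returns [output, exit_status].
def capture_with_reader_thread(*cmd)
  read_end, write_end = IO.pipe
  pid = Process.spawn(*cmd, out: write_end, err: write_end)
  write_end.close # the parent must close its copy, or read never sees EOF

  # Drain the pipe in the background so the child never blocks on a full buffer.
  reader = Thread.new { read_end.read }

  # The main thread is free here; a control loop could poll and signal pid instead.
  _, status = Process.wait2(pid)

  output = reader.value # joins the thread and returns the block's result
  read_end.close
  [output, status.exitstatus]
end

# Assumed test command: writes about 100 KB, more than the 64 KB seen in the question.
out, code = capture_with_reader_thread(RbConfig.ruby, "-e", "10_000.times { puts '123456789' }")
puts "child: cmd out length = #{out.length}; Exit status: #{code}"
```

Because the thread's return value is fetched through Thread#value, no variable is written from one thread and read from another.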

Answer 2

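Indeed, your diagnosis is probably correct. You could implement a select-and-read loop on the pipe while waiting for the process to end, but you can probably get what you need more simply with the stdlib Open3::capture2e.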
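For the select-and-read route, a minimal sketch might look like the following. It assumes a hypothetical helper named capture_with_select and a tick parameter (both invented here); the block passed to it stands in for the once-per-second control check described in the question. The child command is again an assumed Ruby one-liner rather than the shell loop.

```ruby
require "rbconfig"

# Hypothetical helper: reads the child's output as it arrives and yields the
# child's pid whenever `tick` seconds pass without any output.
def capture_with_select(*cmd, tick: 1.0)
  read_end, write_end = IO.pipe
  pid = Process.spawn(*cmd, out: write_end, err: write_end)
  write_end.close # otherwise the pipe never reports EOF

  chunks = []
  loop do
    # Wait up to `tick` seconds for data; nil means the timeout expired.
    ready = IO.select([read_end], nil, nil, tick)
    if ready.nil?
      yield pid if block_given? # e.g. check the DB, send SIGSTOP / SIGCONT
      next
    end
    begin
      chunks << read_end.readpartial(64 * 1024)
    rescue EOFError
      break # every writer has closed the pipe
    end
  end

  read_end.close
  _, status = Process.wait2(pid)
  [chunks.join, status.exitstatus]
end

ticks = 0
out, code = capture_with_select(RbConfig.ruby, "-e", "10_000.times { puts '123456789' }", tick: 0.05) { |_pid| ticks += 1 }
puts "cmd out length = #{out.length}; exit status: #{code}; idle ticks: #{ticks}"
```

In this sketch the control block runs only on idle ticks, so a child that writes continuously would delay it; checking elapsed time inside the loop would be one way around that. If a pid is all that is missing from Open3::capture2e, Open3.popen2e also exposes it through its wait thread (wait_thr.pid).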

ruby Process.spawn stdout => pipe buffer size limit
