Question

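I use the open4 gem to wrap system calls to a potentially long-running third-party command-line tool. The tool sometimes fails, which keeps two processes busy and partially blocks a pipeline, because the parent process is part of a pool of worker scripts (serving a Beanstalk queue). From outside the system, I can programmatically identify a stuck worker script and its process ID based on the data model being processed. Inside the Open4.open4 block, I can identify the child process ID.

I would like to set up the Open4 block so that when a SIGTERM is sent to the parent worker process, it forwards the SIGTERM to the child. In addition, if the child process has still failed to exit after a short wait, I want to send it a SIGKILL. In both cases, I then want the parent process to respond as normal to the SIGTERM it was sent.

This is all being done so that I can expose a "stop" button in a customer service app, giving non-technical team members a tool to get themselves out of a blocked-queue situation.

I have found some related questions on SO - e.g. How to make child process die after parent exits? - but the answers are not really usable for me from Ruby application code.

Here is my current implementation in Ruby, which I have tested on my Mac:

Test stand-in for a "bad" process that won't always respond to SIGTERM: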

# Writing to a log file shows whether or not a detached process continues
# once the parent has closed IO to it.
$f = open( 'log.txt', 'w' )

def say m
  begin
    $f.puts m
    $f.flush
    $stderr.puts m
  rescue Exception => e
    # When the parent process closes, we get
    # #<Errno::EPIPE: Broken pipe - <STDERR>> in this
    # test, but with a stuck child process, this is not 
    # guaranteed to happen or cause the child to exit.
    $f.puts e.inspect
    $f.flush
  end
end

Signal.trap( "TERM" ) { say "Received and ignored TERM" }

# Messages get logged, and sleep allows test of manual interrupts
say "Hello"
sleep 3
say "Foo Bar Baz"
sleep 3
say "Doo Be Doo"
sleep 3
say "Goodbye" 
$f.close

Test Open4 block (part of a "worker" test script):

Open4.open4(@command) do | pid, stdin, stdout, stderr |
  begin
    stderr.each { |l|
      puts "[#{pid}] STDERR: #{l}" if l
    }
  rescue SignalException => e
    puts "[#{$$}] Received signal (#{e.signo} #{e.signm}) in Open4 block"

    # Forward a SIGTERM to child, upgrade to SIGKILL if it doesn't work
    if e.signo == 15
      begin
        puts "[#{$$}] Sending TERM to child process"
        Process.kill( 'TERM', pid )
        # Needs require 'timeout'; the bare timeout(...) form is deprecated
        Timeout.timeout(3.0) { Process.waitpid( pid ) }
      rescue Timeout::Error
        puts "[#{$$}] Sending KILL to child process"
        Process.kill( 'KILL', pid )
      end
    end

    raise e
  end
end

Typical output when I start the worker and then run e.g. kill -15 16854:

[16855] STDERR: Hello
[16854] Received signal (15 SIGTERM) in Open4 block
[16854] Sending TERM to child process
[16854] Sending KILL to child process

Contents of the log file for the same test:

Hello
Received and ignored TERM
Foo Bar Baz

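The code is IMO a little unwieldy, although it appears to work as I want. My questions:

Is the above attempt OK, or is it fatally flawed for the use case I need it for?
Have I missed a cleaner way of doing the same thing using existing Open4 and core Ruby methods?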
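
For comparison with the second question, the TERM-then-KILL escalation could be pulled out of the rescue branch into a small helper that uses only core Ruby. The following is a minimal sketch, not a tested replacement: the name terminate_child and its grace and poll parameters are hypothetical, and it polls with Process::WNOHANG instead of wrapping waitpid in a timeout block. Whether reaping the child here interacts badly with Open4's own wait on the child is an open assumption that I have not checked.

```ruby
# Hypothetical helper (not part of Open4 or core Ruby).
# Sends TERM to a child process, waits up to `grace` seconds for it to exit,
# and falls back to KILL if it is still running.
# Returns :term if the child exited within the grace period, :kill if KILL
# was needed, or :gone if the child no longer exists or was already reaped.
def terminate_child(pid, grace = 3.0, poll = 0.1)
  Process.kill('TERM', pid)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + grace
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    # With WNOHANG, waitpid returns nil immediately while the child still runs.
    return :term if Process.waitpid(pid, Process::WNOHANG)
    sleep poll
  end
  Process.kill('KILL', pid)
  Process.waitpid(pid) # reap the child so that no zombie is left behind
  :kill
rescue Errno::ESRCH, Errno::ECHILD
  :gone
end
```

Inside the Open4 block, the rescue branch would then reduce to a call such as terminate_child(pid) when e.signo == 15, followed by raise e as before.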

How to forward terminate signal to Open4 child process
