Question

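I am using Ruby Net::SSH to SSH into monitored servers, run commands on them via channel.exec, and collect the command output as monitoring data. Occasionally a server develops a fault (for example, a failing hard drive) that causes it to hang indefinitely without responding to the commands Ruby sends. I have confirmed this by issuing the same commands directly from a terminal console on the misbehaving server and watching the shell hang with no response. Ruby then waits indefinitely for a response from the SSH channel, which stalls the monitoring script. How can I tell Ruby to time out after a few seconds?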

I have searched Google and Stack Overflow, but so far the only answers I have found point to Ruby's Timeout::timeout, along with multiple articles explaining why I should not use that library.

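So far, the best option I have found is to prefix the command that Net::SSH sends through channel.exec with "timeout 10", which has the remote Linux shell impose a 10-second limit: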

channel.exec("timeout 10 sleep infinity") do |ch, command_executed|

This does appear to work when tested directly in a terminal console:

bash-4.1$ time timeout 10 sleep infinity
real    0m10.001s
user    0m0.001s
sys     0m0.000s

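My concern with this solution is that it relies on the remote server reliably enforcing a time limit on itself. The whole reason for running the monitoring script is to detect when something has gone wrong on the remote servers, which means I cannot assume they will reliably execute any command I send. This is certainly better than nothing, but I would prefer that the monitor not depend on the thing it is supposed to be monitoring. Ideally, I would like Ruby to impose its own timeout, independent of the remote server.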

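Here is some sample code that reproduces the symptom. Note that running it in IRB will hang the IRB session while it waits indefinitely for a response that never comes, so you will have to kill it with Ctrl-C.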

I tested this with ruby-2.4.1. Note that to test it you will need a working SSH server and a user account with SSH access.

def wait_forever_for_ssh_response()
   command_response = "" # initialize command_response to empty string to keep its scope local

   # SSH to @remote_server and run "sleep infinity"
   begin
      Net::SSH.start(@remote_server, @remote_user, :verify_host_key => :never, :timeout => 15) do |ssh_session|
         ssh_session.open_channel do |channel|
            channel.exec("sleep infinity") do |ch, command_executed|

               # report data the remote command writes to stdout
               channel.on_data do |ch, command_return|
                  puts command_return
               end

               # report data the remote command writes to stderr (extended data)
               channel.on_extended_data do |ch, type, command_error|
                  puts command_error
               end

               # report SSH channel closed
               channel.on_close do |ch|
                  puts "closing connection to #{@remote_server}"
               end
            end      # end channel.exec

            # report SSH failed to open channel
            channel.on_open_failed do |ch, failure_reason_code, failure_description|
               puts " SSH connection to #{@remote_server} failed: #{failure_reason_code} #{failure_description}"
            end
         end         # end ssh_session.open_channel

         ssh_session.loop    # run the event loop; it dispatches the channel callbacks registered above and blocks while the session is busy
      end    # end Net::SSH.start

   # report any other errors
   rescue => error
      puts "command failed: #{error.inspect} #{error.backtrace.inspect}"
   end       # end begin

   puts command_response
end          # end method wait_forever_for_ssh_response()


# load net/ssh library, set server and user names, and connect to @remote_server
require 'net/ssh'
@remote_server = "your SSH server name or IP address goes here"
@remote_user = "your SSH user name goes here"
wait_forever_for_ssh_response

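The desired result is that Ruby waits 10 seconds and, if the remote server has not responded, closes the SSH session and continues execution after this line: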

end    # end Net::SSH.start

The actual result is that Ruby waits indefinitely for a response from the SSH channel.

How do I terminate a Ruby Net::SSH session if a remote channel.exec takes too long to complete?
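
For illustration, here is a minimal sketch of one way a client-side deadline might look without Timeout::timeout. It relies on Net::SSH's Session#loop accepting a wait interval and a block, Session#busy?, and Session#shutdown!. The helper name run_with_deadline and its parameters are hypothetical, not part of Net::SSH, and this has not been verified against a genuinely hung server. Note that the :timeout option passed to Net::SSH.start only covers establishing the connection, not commands run afterwards.

```ruby
# Hypothetical helper (not part of Net::SSH): drive the session's event loop
# until the session goes idle or `seconds` have elapsed, whichever comes first.
# `session` is assumed to behave like Net::SSH::Connection::Session, i.e. to
# respond to loop(wait) { ... }, busy? and shutdown!.
# Returns true if the session went idle in time, false if the deadline hit.
def run_with_deadline(session, seconds, poll_interval = 0.1)
  # Use the monotonic clock so wall-clock adjustments cannot skew the deadline.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds

  # The loop keeps running only while the block returns true. Passing a wait
  # interval keeps each pass from blocking on the socket for longer than
  # poll_interval, so the deadline is checked even if the server sends nothing.
  session.loop(poll_interval) do
    session.busy? && Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
  end

  return true unless session.busy?

  # Deadline reached with work still pending: hard-close the connection
  # locally instead of a graceful close, which would wait on the remote side
  # to acknowledge the channel close.
  session.shutdown!
  false
end
```

In the reproduction code above, this would take the place of the bare ssh_session.loop call, for example run_with_deadline(ssh_session, 10), with a false return value treated as "server did not respond". The deadline is enforced entirely on the monitoring host, so it does not depend on the remote server running timeout or anything else.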

0 answers