Question

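I have a mechanize function that logs me out of a site, but on very rare occasions it fails me. The function goes to a specific page and then clicks a logout button. Occasionally mechanize times out while loading the logout page or clicking the logout button, and the code crashes. So I added a small rescue, which seems to work; it is shown after the first piece of code below.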

def logmeout(agent)
  page = agent.get('http://www.example.com/')
  agent.click(page.link_with(:text => /Log Out/i))
end

Logmeout with rescue:

def logmeout(agent)
  begin
    page = agent.get('http://www.example.com/')
    agent.click(page.link_with(:text => /Log Out/i))
  rescue Timeout::Error
    puts "Timeout!"
    retry
  end
end

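Assuming I understand rescue correctly, it will redo both actions even if only the click timed out. So, for efficiency, I was wondering whether I could use a proc in this situation and pass it a block of code. Would something like this work?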

def trythreetimes
  tries = 0
  begin
    yield
  rescue
    tries += 1
    puts "Trying again!"
    retry if tries <= 3
  end
end

def logmeout(agent)
  trythreetimes {page = agent.get('http://www.example.com/')}
  trythreetimes {agent.click(page.link_with(:text => /Log Out/i))}
end

Note that in my trythreetimes function I left it as a generic rescue so the function is easier to reuse.

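Any help anyone can offer is much appreciated. I realize there are a few different questions here, but they are all things I am trying to learn!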

Answer 1

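I think that, instead of retrying after timeouts on some mechanize requests, you would do better to set the Mechanize::HTTP::Agent::read_timeout attribute to a reasonable number of seconds, such as 2 or 5, or in any case a value that prevents timeout errors for this request.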

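Next, it seems your logout procedure only needs a simple HTTP GET request. I mean there is no form to fill in, so no HTTP POST request. So if I were you, I would first inspect the page source (Ctrl+U in Firefox or Chrome) to identify the link that agent.click(page.link_with(:text => /Log Out/i)) reaches, and request that URL directly. It should be faster, because these kinds of pages are usually blank and Mechanize won't have to load a full HTML page into memory.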

Here is the code I would prefer to use:

def logmeout(agent)
  begin
    agent.read_timeout = 2  # set the agent's read timeout, in seconds
    page = agent.get('http://www.example.com/logout_url.php')
    agent.history.pop       # remove this request from the history
  rescue Timeout::Error
    puts "Timeout!"
    puts "read_timeout attribute is set to #{agent.read_timeout}s" unless agent.read_timeout.nil?
    # retry  # retry is no longer needed
  end
end
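
As a side note on why a single `rescue Timeout::Error` is enough here: on current Ruby versions the timeout exceptions defined by Net::HTTP are subclasses of `Timeout::Error`, which itself descends from `StandardError`. The quick check below uses only the standard library. Which exception class Mechanize actually raises depends on its version, so treat that part as an assumption rather than a guarantee.

```ruby
require 'timeout'
require 'net/http'

# Net::HTTP's timeout exceptions inherit from Timeout::Error (Ruby 2.0+),
# so one rescue clause covers both connect and read timeouts.
read_is_timeout = Net::ReadTimeout.ancestors.include?(Timeout::Error)
open_is_timeout = Net::OpenTimeout.ancestors.include?(Timeout::Error)

# Timeout::Error descends from StandardError, so a bare `rescue` catches it too.
bare_rescue_catches = Timeout::Error.ancestors.include?(StandardError)

puts read_is_timeout, open_is_timeout, bare_rescue_catches
```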

But you can also use your retry function:

def trythreetimes
  tries = 0
  begin
    yield
  rescue StandardError => e  # Timeout::Error is a StandardError on Ruby 1.9.2+; avoid rescuing Exception
    tries += 1
    puts "Error: #{e.message}"
    if tries <= 3
      puts "Trying again!"
      retry
    end
    puts "No more attempts!"
  end
end

def logmeout(agent)
  trythreetimes do
    agent.read_timeout = 2  # set the agent's read timeout, in seconds
    page = agent.get('http://www.example.com/logout_url.php')
    agent.history.pop       # remove this request from the history
  end
end
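
One thing to watch for in the two-block version from the question: a variable first assigned inside a block is local to that block, so `page` would not be visible in the second block. A way around it is to have the retry helper return the block's value. Below is a minimal sketch under that assumption; `with_retries` is a hypothetical name, not part of Mechanize, and the flaky request is simulated so the example runs on its own.

```ruby
require 'timeout'

# Hypothetical helper (not part of Mechanize): runs the block, retrying only
# on Timeout::Error, and returns the block's value on success.
def with_retries(max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue Timeout::Error => e
    raise if attempts >= max_attempts # give up: re-raise the last timeout
    puts "Timeout (#{e.message}), attempt #{attempts} of #{max_attempts}"
    retry
  end
end

# Stand-in for a flaky request: times out twice, then succeeds.
calls = 0
result = with_retries do
  calls += 1
  raise Timeout::Error, 'simulated timeout' if calls < 3
  'page body'
end
puts result
```

With a real agent the same idea would read `page = with_retries { agent.get(url) }` followed by `with_retries { agent.click(page.link_with(:text => /Log Out/i)) }`, so each step is retried separately and a failure that outlasts the retries is still raised to the caller.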

Hope it helps! ;-)

Answer 2

Using mechanize 1.0.0, I ran into this problem from a different source of error.

In my case I was blocked by the proxy, and then by SSL. This worked for me:

ag = Mechanize.new
ag.set_proxy('yourproxy', yourport)
ag.agent.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
ag.get( url )

Catching timeout errors with ruby mechanize
