Question

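After the call to searchEmails(page), the rest of the code in the harvest method (puts "hey") is never executed. I'm probably missing something simple in Ruby, since I'm only just getting back into it.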

def searchEmails(page_to_search)
  begin
    html = @agent.get(url).search('html').to_s
    mail = html.scan(/['.'\w|-]*@+[a-z]+[.]+\w{2,}/).map.to_a
    base = page_to_search.uri.to_s.split("//", 2).last.split("/", 2).first
    mail.each{|e| @file.puts e+";"+base unless e.include? "example.com" or  e.include? "email.com" or  e.include? "domain.com" or  e.include? "company.com" or e.length < 9 or e[0] == "@"}
  end
end

def harvest(url)
  begin
    page = @agent.get(url)
    searchEmails(page)
    puts "hey"
  rescue Exception
  end
end

url="www.example.com"
harvest(url)

Answer 1

@agent.get(url) will raise an exception if the URL is broken or the network connection is interrupted.

The problem in your code can be reduced to the following:

def do_something
  begin
    raise
    puts "I will never get here!"
  rescue
  end
end
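To make the skipped line visible, here is a minimal runnable sketch of the same control flow. The method name steps_with_silent_rescue and the simulated failure are assumptions made for illustration; they are not part of the original code.

```ruby
# Hypothetical helper: records which steps ran, so the skipped line is visible.
def steps_with_silent_rescue
  steps = []
  begin
    steps << :before_raise
    raise "simulated failure"   # stands in for a call such as @agent.get(url) failing
    steps << :after_raise       # never reached, just like puts "hey"
  rescue
    # Empty rescue: the error is swallowed and nothing is reported.
  end
  steps << :after_begin_block
  steps
end

p steps_with_silent_rescue
```

Execution jumps from the raise straight to the rescue clause, so :after_raise is never recorded, and because the rescue body is empty there is no sign that anything went wrong.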

Since you can't prevent the exception from being raised, you need to do something inside the rescue (most likely log it):

begin
  @agent.get(url)
  # ...
rescue Timeout::Error, Errno::EINVAL, Errno::ECONNRESET, EOFError,
       Net::HTTPBadResponse, Net::HTTPHeaderSyntaxError,
       Net::ProtocolError => e
  log(e.message, e.backtrace)
end
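As a rough sketch of what "log it" could look like, the example below uses Ruby's standard Logger and reports the exception class as well as the message. The names run_and_log and search_without_url are hypothetical and exist only for this illustration. The second method mirrors searchEmails as posted, which refers to url even though no such variable is defined inside that method; if that is the case in the real code, the swallowed error may be a NameError rather than a network failure, and logging the class would show it. Rescuing StandardError instead of Exception also avoids trapping things like Interrupt and SystemExit.

```ruby
require "logger"
require "stringio"

# Hypothetical wrapper: runs the block and logs any StandardError instead of hiding it.
def run_and_log(logger)
  yield
rescue StandardError => e
  logger.error("#{e.class}: #{e.message}")
  logger.debug(e.backtrace.join("\n")) if e.backtrace
  nil
end

# Log into a string here so the output is easy to inspect; a real script
# would more likely use Logger.new($stderr) or a log file.
LOG_OUTPUT = StringIO.new
LOGGER = Logger.new(LOG_OUTPUT)

# Mirrors the shape of searchEmails: `url` is not defined inside this method,
# so Ruby raises NameError when the line runs.
def search_without_url
  url.to_s
end

run_and_log(LOGGER) { search_without_url }
puts LOG_OUTPUT.string
```

With the error logged, the rest of the method is still skipped when an exception occurs, but the reason is no longer hidden.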

Ruby code not executed after a method call
