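I'm trying to set up global exception handling so that I can add extra information when an error occurs. I have two classes, "Crawler" and "Amazon". What I want is to call "crawl", have it execute a method defined in Amazon, and have the exception handling in crawl cover that method.
Here are my two classes: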
require 'mechanize'

class Crawler
  Mechanize.html_parser = Nokogiri::HTML

  def initialize
    @agent = Mechanize.new
  end

  def crawl
    puts "crawling"
    begin
      # execute code in Amazon class here?
    rescue Exception => e
      puts "Exception: #{e.message}"
      puts "On url: #{@current_url}"
      puts e.backtrace
    end
  end

  def get(url)
    @current_url = url
    @agent.get(url)
  end
end

class Amazon < Crawler
  # some code with errors
  def stuff
    page = get("http://www.amazon.com")
    puts page.parser.xpath("//asldkfjasdlkj").first['href']
  end
end

a = Amazon.new
a.crawl
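Is there a way to call "stuff" from within "crawl" so that the exception handling covers the whole stuff method? Is there a better way to achieve this?
Edit: This is what I ended up doing: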
require 'mechanize'

class Crawler
  Mechanize.html_parser = Nokogiri::HTML

  def initialize
    @agent = Mechanize.new
  end

  def crawl
    yield
  rescue Exception => e
    puts "Exception: #{e.message}"
    puts "On url: #{@current_url}"
    puts e.backtrace
  end

  def get(url)
    @current_url = url
    @agent.get(url)
  end
end

c = Crawler.new
c.crawl do
  page = c.get("http://www.amazon.com")
  puts page.parser.xpath("//asldkfjasdlkj").first['href']
end
Answer 0 (score: 0)
I managed to get the behavior I wanted by using "super" and a placeholder method. Is there a better way?
require 'mechanize'

class Crawler
  Mechanize.html_parser = Nokogiri::HTML

  def initialize
    @agent = Mechanize.new
  end

  def stuff
  end

  def crawl
    stuff
  rescue Exception => e
    puts "Exception: #{e.message}"
    puts "On url: #{@current_url}"
    puts e.backtrace
  end

  def get(url)
    @current_url = url
    @agent.get(url)
  end
end

class Amazon < Crawler
  # some code with errors
  def stuff
    super
    page = get("http://www.amazon.com")
    puts page.parser.xpath("//asldkfjasdlkj").first['href']
  end
end

a = Amazon.new
a.crawl
Answer 1 (score: 0)
You can have crawl accept a block:
def crawl
  begin
    yield
  rescue Exception => e
    # handle exception
  end
end

def stuff
  crawl do
    # implementation of stuff
  end
end
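I'm not crazy about the idea of a method with no body. A block probably makes more sense here. Depending on your requirements, it may also remove the need for subclassing.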
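A self-contained sketch of the block approach follows. Two things in it are my own assumptions rather than anything from the question: `get` is a stub that records the URL and returns an empty array, so the example runs without Mechanize or a network connection, and `crawl` rescues `StandardError` instead of `Exception`, so that `Interrupt` (Ctrl-C) and `SystemExit` still propagate. The class name `BlockCrawler` and the `last_error` reader are hypothetical, added only so the failure can be inspected afterwards.

```ruby
# Sketch only: BlockCrawler and last_error are illustrative names, and get is a
# stub standing in for @agent.get(url) so the example needs no network access.
class BlockCrawler
  attr_reader :current_url, :last_error

  # Runs the given block; any StandardError raised inside it is reported
  # together with the URL that was being fetched at the time.
  def crawl
    yield self
  rescue StandardError => e
    @last_error = e
    warn "Exception: #{e.message}"
    warn "On url: #{@current_url}"
    nil
  end

  # Stub: records the URL and returns an empty result list instead of a page.
  def get(url)
    @current_url = url
    []
  end
end

crawler = BlockCrawler.new
result = crawler.crawl do |c|
  links = c.get("http://www.amazon.com")
  links.first.fetch("href") # links.first is nil, so this raises NoMethodError
end
```

Passing the crawler into the block (`yield self`) is a design choice for the sketch: the block can call `get` without closing over an outer variable, which keeps the per-site code independent of how the crawler was created.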
Answer 2 (score: 0)
If you want another approach, take a look at the "Strategy" design pattern:
# test_mach.rb
require 'rubygems'
require 'mechanize'

# this is the context class, which calls the different strategy implementations
class Crawler
  def initialize(some_website_strategy)
    @strategy = some_website_strategy
  end

  def crawl
    begin
      @strategy.crawl
      # execute code in Amazon class here?
    rescue Exception => e
      puts "==== starts this exception comes from Parent Class"
      puts e.backtrace
      puts "==== ends this exception comes from Parent Class"
    end
  end
end

# strategy class for Amazon
class Amazon
  def crawl
    puts "now crawling amazon"
    raise "oh ... some errors when crawling amazon"
  end
end

# strategy class for taobao.com
class Taobao
  def crawl
    puts "now crawling taobao"
    raise "oh ... some errors when crawling taobao"
  end
end
Then run this code:
amazon = Crawler.new(Amazon.new)
amazon.crawl
taobao = Crawler.new(Taobao.new)
taobao.crawl
Result:
now crawling amazon
==== starts this exception comes from Parent Class
test_mach.rb:27:in `crawl'
test_mach.rb:13:in `crawl'
test_mach.rb:38
==== ends this exception comes from Parent Class
now crawling taobao
==== starts this exception comes from Parent Class
test_mach.rb:34:in `crawl'
test_mach.rb:13:in `crawl'
test_mach.rb:40
==== ends this exception comes from Parent Class
By the way, for your implementation, I would do basically the same as you, except:
# my implementation
class Crawler
  def stuff
    raise "abstract method called"
  end
end
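The "abstract method" idea above can be sketched without Mechanize as follows. This is an illustration under my own assumptions, not the answerer's code: the class names are hypothetical, the placeholder raises `NotImplementedError` instead of a plain string, and `crawl` rescues `StandardError` rather than `Exception`. With that combination a forgotten override is not swallowed by the handler, because `NotImplementedError` descends from `ScriptError`, not `StandardError`.

```ruby
# Sketch only: class and method names are illustrative, and no real HTTP
# request is made.
class TemplateCrawler
  # Template method: subclasses supply #stuff, the base class supplies the
  # error handling around it.
  def crawl
    stuff
  rescue StandardError => e
    warn "Exception: #{e.message}"
    :failed
  end

  # Placeholder that fails loudly if a subclass forgets to override it.
  def stuff
    raise NotImplementedError, "#{self.class} must implement #stuff"
  end
end

class FailingSite < TemplateCrawler
  def stuff
    raise "oh ... some errors when crawling"
  end
end

# The RuntimeError raised in FailingSite#stuff is handled inside #crawl.
status = FailingSite.new.crawl

# NotImplementedError descends from ScriptError, not StandardError, so the
# rescue in #crawl does not swallow it and the missing override surfaces.
missing_override =
  begin
    TemplateCrawler.new.crawl
  rescue NotImplementedError => e
    e.message
  end
```

If `crawl` rescued `Exception` as in the question, the missing override would be caught and printed like any other crawl error instead of surfacing to the caller.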
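If you want yet another approach, look at the "around alias" technique ("Metaprogramming Ruby", page 155). However, I think an around alias is the reverse case of the Strategy pattern.
(I mean, err... I hope I didn't confuse you ^_^)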
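For readers unfamiliar with the term, here is a minimal sketch of an around alias applied to this problem. It is my own illustration, not taken from the book or from the answer: `SimpleCrawler` and `stuff_without_rescue` are hypothetical names, and the method body just raises instead of crawling anything.

```ruby
# Sketch only: SimpleCrawler and its methods are illustrative; no network.
class SimpleCrawler
  def stuff
    raise "oh ... some errors when crawling"
  end
end

# Reopen the class, keep the original method under a new name, then redefine
# the method so it wraps a call to the original in a rescue.
class SimpleCrawler
  alias_method :stuff_without_rescue, :stuff

  def stuff
    stuff_without_rescue
  rescue StandardError => e
    warn "Exception: #{e.message}"
    :rescued
  end
end

outcome = SimpleCrawler.new.stuff
```

Here the wrapping is attached to the method after the fact, whereas with Strategy the wrapper is written first and the per-site code is plugged into it, which may be what the "reverse case" remark refers to.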