Gem Resque Error - undefined method 'perform' after overriding it from the superclass

Time: 2012-03-26 06:54:15

Tags: ruby-on-rails redis mechanize resque

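First of all, thank you all for the valuable advice you give to programmers like me when we tackle everyday problems.

This is my first question on Stack Overflow; I have been stuck on this problem for almost a week.

We are building a scraper that crawls a specific website and extracts content from it, and we are using Mechanize to do that. After spending a lot of time on the decision, we chose the Resque gem (backed by Redis) to run the crawling process as a background task. However, when I send the process to the background, I get the error in the title.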

My code in lib/parsers/home.rb:

require 'resque'
require File.dirname(__FILE__)+"/../index"
class Home < Index
  Resque.enqueue(Index , :page )

  def self.perform(page)

    super (page)

    search_form = page.form_with :name=>"frmAgent"
    resuts_page = search_form.submit
    total_entries = resuts_page.parser.xpath('//*[@id="PagingTable"]/tr[2]/td[2]').text

    if total_entries =~ /(\d+)\s*$/
      total_entries = $1
    else
      total_entries = "unknown"
    end
    start_res_idx = 1
    while true
      puts "Found #{total_entries} entries"
      detail_links = resuts_page.parser.xpath('//*[@id="MainTable"]/tr/td/a')
      detail_links.each do |d_link|
        if d_link.attribute("class")
          next
        else
          data_page = @agent.get d_link.attribute("href")
          fields = get_fields_from_page data_page
          save_result_page page.uri.to_s, fields
          #break
        end
      end

      site_done

    rescue Exception => e
      puts "error: #{e}"
    end
  end
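
As background for the `Resque.enqueue(Index , :page )` line above: Resque stores each job as a JSON payload holding the class name and the arguments, and a worker later calls the class-level `perform` with those arguments. The sketch below is a plain-Ruby simulation of that round trip, not Resque's own code; `FakeQueue` and `CrawlJob` are hypothetical names and the URL is a placeholder. It suggests that job arguments are best kept to simple JSON-friendly values, such as a URL string, rather than a symbol placeholder or a live Mechanize page.

```ruby
require 'json'

# Hypothetical stand-in for a Resque queue: it only mimics the
# serialize-then-call-perform round trip, nothing else.
module FakeQueue
  @jobs = []

  # Store the job the way a JSON-backed queue would: class name plus args.
  def self.enqueue(klass, *args)
    @jobs << JSON.generate('class' => klass.name, 'args' => args)
  end

  # Take the oldest payload, look up the class, call its class-level perform.
  def self.work_one
    payload = JSON.parse(@jobs.shift)
    Object.const_get(payload['class']).perform(*payload['args'])
  end
end

# Hypothetical job class; a real crawler would fetch the page inside perform.
class CrawlJob
  def self.perform(url)
    "would crawl #{url.inspect} (#{url.class})"
  end
end

FakeQueue.enqueue(CrawlJob, 'http://example.com/search')
FakeQueue.enqueue(CrawlJob, :page) # the symbol is serialized as the string "page"

first_result  = FakeQueue.work_one
second_result = FakeQueue.work_one
puts first_result
puts second_result
```

Under this model, the job would receive only the values that survive serialization, so the page itself would have to be fetched inside `perform`.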

And the superclass in lib/index.rb is:

require 'resque'
require 'mechanize'
require 'mechanize/form'

class Index

  @queue = :Index_queue

  def initialize(site)
    @site = site
    @agent = Mechanize.new
    @agent.user_agent = Mechanize::AGENT_ALIASES['Windows Mozilla']
    @agent.follow_meta_refresh = true
    @rows_parsed = 0
    @rows_total = 0
  rescue Exception => e
    log "Unable to login: #{e.message}"
  end

  def run
    log "Parsing..."
    url = "unknown"
    if @site.url
      url = @site.url
      log "Opening #{url} as a data page"
      @page = @agent.get(url)
      #perform method should be override in subclasses
      @data = self.perform(@page)
    else
      #some sites do not have "datapage" URL
      #for example after login you're already on your very own datapage
      #this is to be addressed in 'perform' method of subclass
      @data = self.perform(nil)
    end

  rescue Exception=>e
    puts "Failed to parse URL '#{url}', exception=>"+e.message
    set_site_status("error "+e.message)
  end

  #overriding method
  def self.perform(page)
  end

  def save_result_page(url, result_params)
    result = Result.find_by_sql(["select * from results where site_id = ? AND ref_code = ?", @site.id, utf8(result_params[:ref_code])]).first
    if result.nil?
      result_params[:site_id] = @site.id
      result_params[:time_crawled] = DateTime.now().strftime "%Y-%m-%d %H:%M:%S"
      result_params[:link] = url
      result = Result.create result_params
    else
      result.result_fields.each do |f|
        f.delete
      end
      result.link = url
      result.time_crawled = DateTime.now().strftime "%Y-%m-%d %H:%M:%S"
      result.html = result_params[:html]
      fields = []
      result_params[:result_fields_attributes].each do |f|
        fields.push ResultField.new(f)
      end
      result.result_fields = fields
      result.save
    end
    @rows_parsed +=1
    msg = "Saved #{@rows_parsed}"
    msg +=" of #{@rows_total}" if @rows_total.to_i > 0
    log msg
    return result
  end
end

What is wrong with this code?
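
One possible source of the error in the title, offered as a guess rather than a confirmed diagnosis: `run` is an instance method and calls `self.perform(@page)`, but `perform` is defined with `def self.perform`, which makes it a class method that instances do not respond to. A minimal plain-Ruby sketch of that mismatch, using a hypothetical `IndexSketch` class and neither Resque nor Mechanize:

```ruby
# Hypothetical reduction of the Index class: perform is a class method,
# while the run_* methods are instance methods that try to reach it.
class IndexSketch
  def self.perform(page)
    "performed #{page}"
  end

  def run_broken
    self.perform('page')        # the instance has no #perform
  end

  def run_via_class
    self.class.perform('page')  # dispatches to the class method
  end
end

job = IndexSketch.new

# Capture the error instead of letting it abort the script.
broken_error =
  begin
    job.run_broken
    nil
  rescue NoMethodError => e
    e
  end

puts broken_error.message
puts job.run_via_class
```

Because class methods are inherited, `self.class.perform` on an instance of a subclass would resolve to that subclass's override.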

Thanks

0 answers:

No answers