Ruby gem mechanize throws an error: undefined method `<=>'

Time: 2010-11-01 21:53:09

Tags: ruby

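I'm using the Ruby gem mechanize to scrape some HTML. When I first load my page and display the results, everything works fine. After a reload, I get this error when executing "search_results = @agent.submit(search_form)":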

undefined method `<=>' for {emptyelem <input name="hl" value="en" type="hidden">}:Hpricot::Elem

Before I post any code, does this ring a bell for anyone?

Thanks.

Code:

    start = Time.now

    # initial set up
    @agent = Mechanize.new
    Mechanize.html_parser = Hpricot
    page = @agent.get("http://www.google.com/")
    search_form = page.forms.first

    # conduct initial search
    @search_term = search_form.q = params[:search].to_s
    search_results = @agent.submit(search_form)

    # helper variables
    search_qs = ""; @page_number = 1; i = 0; @flag = false;

    # get the query string structure
    search_results.links.each { |li| search_qs = li.href if li.href.match(/.*search\?q=.*start=.*/) }

    # search through all paginated pages
    while (i < 500)
      search_qs = search_qs.gsub(/start=\d+/,"start=#{i}")
      @search_url = "http://google.com#{search_qs}"
      search_results = @agent.get(@search_url)
      search_results.links.each { |li| @flag = true if li.text.match("All Bout Texas Tailgating") }
      break if @flag
      i+=10; @page_number+=1
    end

    @execution_time = Time.now - start

    render :layout => false

View:

<h2>Query results for "<%= @search_term %>" on Google</h2>

<% if @flag %>
    <p>What page is this keyword found: <b><%= @page_number %></b></p>
    <p><%= link_to  "Click to see page", "#{@search_url}", {:target => "_blank"} %></p>
    <p>How long did this query take to run?: <%= @execution_time %> seconds</p>
<% else %>
    <p>Keyword not found in Google search results</p>
<% end %>

STACK TRACE:

 NoMethodError (undefined method `<=>' for {emptyelem <input name="hl" value="en" type="hidden">}:Hpricot::Elem):
  mechanize (1.0.0) lib/mechanize/form/field.rb:30:in `<=>'
  mechanize (1.0.0) lib/mechanize/form.rb:171:in `sort'
  mechanize (1.0.0) lib/mechanize/form.rb:171:in `build_query'
  mechanize (1.0.0) lib/mechanize.rb:373:in `submit'
  app/controllers/admin/importer_controller.rb:24:in `check_page_rank'
  /opt/local/lib/ruby/1.8/webrick/httpserver.rb:104:in `service'
  /opt/local/lib/ruby/1.8/webrick/httpserver.rb:65:in `run'
  /opt/local/lib/ruby/1.8/webrick/server.rb:173:in `start_thread'
  /opt/local/lib/ruby/1.8/webrick/server.rb:162:in `start'
  /opt/local/lib/ruby/1.8/webrick/server.rb:162:in `start_thread'
  /opt/local/lib/ruby/1.8/webrick/server.rb:95:in `start'
  /opt/local/lib/ruby/1.8/webrick/server.rb:92:in `each'
  /opt/local/lib/ruby/1.8/webrick/server.rb:92:in `start'
  /opt/local/lib/ruby/1.8/webrick/server.rb:23:in `start'
  /opt/local/lib/ruby/1.8/webrick/server.rb:82:in `start'

Rendered rescues/_trace (98.4ms)
Rendered rescues/_request_and_response (1.2ms)
Rendering rescues/layout (internal_server_error)

1 Answer:

Answer 0: (score: 0)

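If you look at the source for mechanize in form.rb, form submission calls a function named build_query, which sorts the fields on the form. Since sort uses the <=> operator, and that operator is not defined on Hpricot elements, you get the exception.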
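To illustrate why sort needs <=>, here is a minimal sketch in plain Ruby. PlainNode and OrderedNode are hypothetical stand-ins I made up for this example; they are not Mechanize or Hpricot classes, and the position attribute is an assumption used only to give the second class something to compare on. On Ruby 1.8 (as in the stack trace) an object without <=> raises NoMethodError inside sort; on newer Rubies, Object#<=> exists but returns nil for two different objects, so sort raises ArgumentError instead. The sketch rescues both.

```ruby
# Hypothetical stand-in for a parser node that has no usable <=>.
class PlainNode
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Hypothetical stand-in for a node that knows how to order itself.
class OrderedNode < PlainNode
  attr_reader :position

  def initialize(name, position)
    super(name)
    @position = position
  end

  # Array#sort calls this to decide the order of two elements.
  def <=>(other)
    position <=> other.position
  end
end

# Sort the nodes and return their names, or a message if sort cannot compare them.
def try_sort(nodes)
  nodes.sort.map { |node| node.name }
rescue NoMethodError, ArgumentError => e
  "sort failed: #{e.class}"
end

plain_result   = try_sort([PlainNode.new("q"), PlainNode.new("hl")])
ordered_result = try_sort([OrderedNode.new("q", 2), OrderedNode.new("hl", 1)])

puts plain_result.inspect
puts ordered_result.inspect
```

The first call fails because the elements cannot be compared, which seems to be the situation the form fields end up in when they are backed by Hpricot elements; the second succeeds because the elements define their own ordering.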

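It seems mechanize was built to work with Nokogiri, so it may have bugs or inconsistencies with other parser implementations. I haven't dug deep into the mechanize source and don't want to blame anyone, but you may want to try switching to Nokogiri for this project (if possible). In this snippet, the only visible dependency on Hpricot is the "Mechanize.html_parser = Hpricot" line. It seems odd to me that mechanize would throw an exception on a hidden form field with Hpricot, but the stack trace is pretty clear on this point.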

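Your other main option is to dive into the mechanize source and see whether you can fix it yourself (or file a bug on the mechanize GitHub and hope someone picks it up).

Good luck.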