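Solved: the line "abc = list.scan(/\[([^\)]+)\]/).last.first" was correct, but it also captured the quotation marks, which the website's search form did not accept. I corrected it to abc = list.scan(/\"([^\)]+)\"/).join.
Thanks for your help.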
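To illustrate the difference between the two patterns, here is a minimal sketch. It assumes each `list` value is a string shaped like `["ruby mechanize"]`; the post does not show the raw CSV rows, so the sample value is hypothetical.

```ruby
# Hypothetical sample row; the real CSV contents are not shown in the post.
list = '["ruby mechanize"]'

# Original pattern: captures everything between the square brackets,
# so the surrounding double quotes end up inside the keyword.
with_quotes = list.scan(/\[([^\)]+)\]/).last.first

# Corrected pattern: captures only what sits between the double quotes.
without_quotes = list.scan(/\"([^\)]+)\"/).join

puts with_quotes.inspect
puts without_quotes.inspect
```

Using `inspect` rather than a plain `puts` makes stray quote characters easier to spot when comparing the two values.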
I have to automate searches using a list of 100 keywords from a CSV file.
Using Mechanize, I can submit a search with this example (http://mechanize.rubyforge.org/GUIDE_rdoc.html):
agent = Mechanize.new
page = agent.get('http://google.com/')
google_form = page.form('f')
google_form.q = 'ruby mechanize'
page = agent.submit(google_form)
pp page
However, when I loop through the CSV file, it returns an error (in this example, the first CSV entry would be 'ruby mechanize'):
# The CSV list has already been imported; this loops through the array "raw_list".
raw_list.each do |list|
  abc = list.scan(/\[([^\)]+)\]/).last.first
  # A test "puts abc" printed "ruby mechanize", so I don't understand why the rest of this doesn't work.
  agent = Mechanize.new
  page = agent.get('http://google.com/')
  google_form = page.form('f')
  google_form.q = abc
  # Even though abc = "ruby mechanize", an error occurs.
  page = agent.submit(google_form)
  pp page
end
The variable "abc" does not seem to be accepted, yet the search works if you type 'ruby mechanize' in manually, even though the two are identical.
The error that occurs is:
C:filename: in `block (2 levels) in <top (required)>': undefined method `text' for nil:NilClass (NoMethodError)
from C:/RailsInstaller/Ruby2.0.0/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize.rb:442:in `get'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:23:in `block in <top (required)>'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:19:in `each'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:19:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
Any help would be greatly appreciated.
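One way to make sure each keyword is a clean string before it reaches the form is to let Ruby's standard `csv` library do the parsing instead of a regex. This is a different technique from the one used in the question, shown only as a sketch: the sample CSV text and the `keywords_from` helper are hypothetical, and it assumes one keyword per row in the first column.

```ruby
require 'csv'

# Hypothetical CSV content; the real file is not shown in the question.
csv_text = <<~ROWS
  ruby mechanize
  "nokogiri tutorial"

  ruby csv
ROWS

# Hypothetical helper: return the first column of every row as a stripped
# string, skipping rows that are blank.
def keywords_from(csv_text)
  CSV.parse(csv_text)
     .map { |row| row.first.to_s.strip }
     .reject(&:empty?)
end

keywords_from(csv_text).each do |keyword|
  # In the real script, this is where the form field would be set and the
  # form submitted with Mechanize.
  puts "would search for: #{keyword.inspect}"
end
```

Because the CSV parser removes the enclosing double quotes itself, the values can be assigned to `google_form.q` without further cleanup.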
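Answer 0 (score: 0)
Your error is telling you that something inside the loop that starts on line 19 of your code (the trace points at line 23) is causing a problem at line 442 of mechanize.rb.
I tried your sample in IRB, and it seems to work fine: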
2.2.2 :001 > require 'mechanize'
=> true
2.2.2 :002 > agent = Mechanize.new
=> #<Mechanize:...
2.2.2 :003 > page = agent.get('http://google.com/')
=> #<Mechanize::Page
...
2.2.2 :004 > google_form = page.form('f')
=> #<Mechanize::Form
...
2.2.2 :005 > google_form.q
=> ""
2.2.2 :006 > abc = "ruby mechanize"
=> "ruby mechanize"
2.2.2 :007 > google_form.q = abc
=> "ruby mechanize"
2.2.2 :008 > page = agent.submit(google_form)
=> #<Mechanize::Page
...
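If nothing matches, String#scan returns an empty array, so .last returns nil and calling .first on it raises a NoMethodError. That makes this line a likely trouble spot:
abc = list.scan(/\[([^\)]+)\]/).last.first
For comparison, StringScanner#scan does return nil when the match fails:
http://ruby-doc.org/stdlib-2.2.0/libdoc/strscan/rdoc/StringScanner.html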
You can replace it with:
abc = list.scan(/\[([^\)]+)\]/).join
That way you always get a string, although it may just be "".
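The difference between the two call chains can be checked without Mechanize. The sketch below uses two hypothetical input strings, one that matches the pattern and one that does not, and compares `.last.first` with `.join`.

```ruby
# Hypothetical inputs: one row that matches the pattern and one that does not.
matching = '[ruby mechanize]'
empty    = 'no brackets here'

pattern = /\[([^\)]+)\]/

# String#scan with a capture group returns an array of arrays,
# or an empty array when nothing matches.
hits   = matching.scan(pattern)
misses = empty.scan(pattern)

# .last.first works only when there is at least one match.
first_style = hits.last.first

# With no match, .last is nil, so .first raises NoMethodError.
error_class =
  begin
    misses.last.first
    nil
  rescue NoMethodError => e
    e.class
  end

# .join returns a string in both cases.
joined_hit  = hits.join
joined_miss = misses.join

puts joined_miss.empty? ? 'skip this row' : joined_miss
```

Since `.join` can yield an empty string, it may be worth skipping such rows rather than submitting a blank search.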