Question

I'm learning how to use Nokogiri and have a few questions based on the following code:

require 'rubygems'
require 'mechanize'

# Later Mechanize releases dropped the WWW:: namespace, so use Mechanize.new
post_agent = Mechanize.new
post_page = post_agent.get('http://www.vbulletin.org/forum/showthread.php?t=230708')

puts "\nabsolute path with tbody gives nil"
puts  post_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]').xpath('text()').to_s.strip.inspect

puts "\n.at_xpath gives an empty string"
puts post_page.parser.at_xpath("//div[@id='posts']/div/table/tr/td/div[2]").at_xpath('text()').to_s.strip.inspect

puts "\ntwo lines solution with .at_xpath gives an empty string"
rows =   post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")
puts rows[0].at_xpath('text()').to_s.strip.inspect


puts
puts "two lines working code"
rows =   post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")
puts rows[0].xpath('text()').to_s.strip

puts "\none line working code"
puts post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")[0].xpath('text()').to_s.strip

puts "\nanother one line code"
puts post_page.parser.at_xpath("//div[@id='posts']/div/table/tr/td/div[2]").xpath('text()').to_s.strip

puts "\none line code with full path"
puts post_page.parser.xpath("/html/body/div/div/div/div/div/table/tr/td/div[2]")[0].xpath('text()').to_s.strip

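1. Is it better to use // or / in XPath? @AnthonyWJones says that 'the use of an unprefixed //' is not a good idea.
2. I had to remove tbody from every working XPath, otherwise I got a 'nil' result. How can an element be removed from the XPath and the query still work?
3. Do I have to call .xpath twice to extract the data if I'm not using a full XPath?
4. Why can't I get .at_xpath to work for extracting the data? It works nicely here; what is the difference?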

Answer 1

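1. // means every node at every level, so it is more expensive than /.
2. You can use * as a placeholder.
3. No, you can run a single XPath query, get the element, and then call Nokogiri's text method on the node.
4. Sure you can. Have a look at this question and my benchmark file. You will see an example of at_xpath there.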

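I noticed you use the text() expression a lot. That is not required with Nokogiri. You can retrieve the node and then call the text method on it. It's much less expensive.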

Also note that Nokogiri supports .css selectors. They can be easier to use if you're working with HTML pages.

How to use the Nokogiri methods .xpath & .at_xpath
