How can I use Nokogiri to get an image via XPath? My main problem is that the div may be present without an image inside it:
image_node = @get_doc.xpath( '//*[@id="recaptcha_image"]/img/@src').map {|a| a.value }
#binding.pry
if image_node != nil
rec = Net::HTTP.get( URI.parse( "#{image_node['src']}" ) )
end
But I get:
in `[]': can't convert String into Integer (TypeError)
How do I do this correctly?
Part of the HTML:
<div id="recaptcha_widget" style="display: none">
<div id="recaptcha_image">
<img *****>
</div>
<input type="text" id="recaptcha_response_field" name="recaptcha_response_field"
style="width: 295px">
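Answer 0 (score: 2)
For most HTML queries, and many XML queries, I recommend using CSS selectors rather than XPath. With CSS the intent of the query is easy to see: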
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<div id="recaptcha_widget" style="display: none">
<div id="recaptcha_image">
<img src="path_to_image.jpg">
</div>
<input type="text" id="recaptcha_response_field" name="recaptcha_response_field" style="width: 295px">
EOT
doc.at('#recaptcha_widget img')['src'] # => "path_to_image.jpg"
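As for the TypeError in the question: `xpath(...).map { |a| a.value }` returns a plain Array of Strings, so `image_node['src']` indexes an Array with a String. The check `image_node != nil` also never fails, because an empty Array is not nil. A minimal sketch in plain Ruby, where `srcs` is a stand-in for that mapped result (an assumption for illustration, not the asker's real data):

```ruby
# Stand-in for the result of xpath(...).map { |a| a.value }: an Array of Strings.
srcs = ["path_to_image.jpg"]

begin
  srcs['src']            # Array#[] expects an Integer index or a Range, not a String
rescue TypeError => e
  error_class = e.class  # TypeError, the same error class the question reports
end

first_src   = srcs.first   # the first matched src value
missing     = [].first     # nil when no <img> matched
always_true = ([] != nil)  # true: an empty Array is still not nil
```

So with the XPath approach, take `.first` and test that value for nil rather than testing the Array itself.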
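"How do I check for the case where I have the div but no image?"
How do you check whether there is an <img> tag embedded inside the <div>? Split the query into two parts and check for nil: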
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<div id="recaptcha_widget" style="display: none">
<div id="recaptcha_image">
<img src="path_to_image.jpg">
</div>
<div id="recaptcha_image2">
</div>
<input type="text" id="recaptcha_response_field" name="recaptcha_response_field" style="width: 295px">
EOT
img = doc.at('#recaptcha_widget img')
img_src = img['src'] # => "path_to_image.jpg"
If the <img> tag doesn't exist, you get nil:
img = doc.at('#recaptcha_widget2 img') # => nil
From there, check whether img is set:
if (img)
# ...do something...
end
Alternatively, use a trailing rescue to catch the exception raised when [] is called on nil, assign nil to img_src, and then test that:
img_src = doc.at('#recaptcha_widget img')['src'] rescue nil # => "path_to_image.jpg"
img_src = doc.at('#recaptcha_widget2 img')['src'] rescue nil # => nil
if (img_src)
# do something
end