Extracting this string with Nokogiri

Time: 2014-04-30 05:19:40

Tags: html ruby-on-rails ruby xpath nokogiri

Using XPath or CSS, can someone suggest a way to extract this string:

"in 7 days"

from:

<div class="thing text-text" data-thing-id="29966403">
  <div class="thinguser"><i class="ico ico-water ico-blue"></i>
    <div class="status">in 7 days
    </div>
  </div>
  <div class="ignore-ui pull-right"><input type="check box" >
  </div>
  <div class="col_a col text">
    <div class="text">foobar
    </div>
  </div>
  <div class="col_b col text">
    <div class="text">foobar desc
    </div>
  </div>
</div>

The XPath that Chrome reports looks like this:

 //*[@id="content"]/div/div/div[2]/div[4]/div[1]/div

Thanks in advance, ~Chris

2 answers:

Answer 0: (score: 1)

Use at_css:

doc.at_css('div.thing > div.thinguser > div.status').text

Answer 1: (score: 0)

An alternative solution:

require 'nokogiri'

html = %q{ 
  <html>
   <body>
    <div class="thing text-text" data-thing-id="29966403">
    <div class="thinguser"><i class="ico ico-water ico-blue"></i>
      <div class="status">in 7 days
      </div>
    </div>
    <div class="ignore-ui pull-right"><input type="check box" >
    </div>
    <div class="col_a col text">
      <div class="text">foobar
      </div>
    </div>
    <div class="col_b col text">
      <div class="text">foobar desc
      </div>
    </div>
   </div>
 </body>
</html>
}

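# Parse as HTML: the markup is not well-formed XML (the <input> is never closed).
doc = Nokogiri::HTML(html)
status = doc.at_css('.status')

# text includes the trailing newline and indentation, so strip it.
puts status.text.strip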