Suppose a web page contains the following values:
<td> <a href="https://www.test.com/test123/a.html"> test11 </a> </td>
<td> <a href="https://www.test.com/test12333/r.html"> test12 </a> </td>
<td> <a href="https://www.test.com/testaa123/t.html"> test21 </a> </td>
<td> <a href="https://www.test.com/test123123/b.html"> test31 </a> </td>
Is there any way to find the value test21 using Ruby?
Or is there a way to find the href value that contains a given substring?
Answer 0 (score: 1)
Try this tutorial for Nokogiri.
Example code:
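require 'nokogiri'
require 'open-uri'

PAGE_URL = "http://ruby.bastardsbook.com/files/hello-webpage.html"
# Fetch and parse the page (URI.open is needed on Ruby 3.0+; older versions also accept open)
page = Nokogiri::HTML(URI.open(PAGE_URL))
# Text of the first <li> element on the page
page.css('li')[0].text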
This returns YouTube, given the following page:
<div id="funstuff">
<p>Here are some entertaining links:</p>
<ul>
<li><a href="http://youtube.com">YouTube</a></li>
<li><a data-category="news" href="http://reddit.com">Reddit</a></li>
<li><a href="http://kathack.com/">Kathack</a></li>
<li><a data-category="news" href="http://www.nytimes.com">New York Times</a></li>
</ul>
</div>