When parsing HTML with Nokogiri and selecting the a elements inside elements with class="favourite":
galleries = doc.css(".favourite a")
# doc holds the result of Nokogiri::HTML(source_page.body)
puts galleries
this returns:
<a href="/galleries/6730">...</a>
<a href="/favourites/40565414">...</a>
<a href="/galleries/10851">...</a>
<a href="/favourites/40850848">...</a>
How can I extract only the href attribute values that match /galleries/[0-9]+?
Answer 0 (score: 1)
galleries.xpath("@href[contains(., 'galleries')]").map(&:value)
# => ["/galleries/6730", "/galleries/10851"]
Answer 1 (score: 1)
Using more Ruby and less XPath:
doc.css('.favourite a').map { |a| a['href'][%r{/galleries/\d+}] }.compact
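
If stricter matching is wanted, one option is to anchor the pattern so that only hrefs that are exactly /galleries/ followed by digits pass. A substring test would also accept a path such as /favourites/galleries/1. The sketch below works on plain strings so it runs without Nokogiri; the method name gallery_hrefs and the sample array are hypothetical, standing in for the values you would get from doc.css('.favourite a').map { |a| a['href'] }.

```ruby
# Hypothetical helper: keeps only hrefs that are exactly "/galleries/<digits>".
# \A and \z anchor the pattern to the start and end of the whole string.
GALLERY_PATH = %r{\A/galleries/\d+\z}

def gallery_hrefs(hrefs)
  # compact drops nil entries (an <a> without an href attribute yields nil);
  # grep keeps the strings that match the anchored pattern.
  hrefs.compact.grep(GALLERY_PATH)
end

# Sample data mirroring the hrefs shown in the question (assumed input).
sample = [
  "/galleries/6730",
  "/favourites/40565414",
  "/galleries/10851",
  "/favourites/40850848",
  nil
]

p gallery_hrefs(sample)
# => ["/galleries/6730", "/galleries/10851"]
```

Unlike indexing the string with a regex, grep returns each matching href unchanged rather than only the matched portion.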