Split a string into an array of hashes, keeping data from the delimiters

Time: 2014-01-22 13:19:11

Tags: html ruby string

I have a string like this:

lorep ipsum <a href="#" class="link-1">dolor sit</a>amet, consectetur <a href="#" class="link-2">adipiscing</a> elit.

I need to split it into fragments, but keep the link class for the fragments that sit inside anchors. The ideal result would be:

['lorep ipsum ', {'link-1' => 'dolor sit'}, 'amet, consectetur', {'link-2' => 'adipiscing'}, ' elit.']

Or:

['lorep ipsum ', ['link-1', 'dolor sit'], 'amet, consectetur', ['link-2', 'adipiscing'], ' elit.']

I tried this code:

string.split(/<[^>]*>/)

But it only returns a plain array of text fragments, without the link classes.
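`String#split` normally discards whatever the pattern matches, which is why the class names are lost. When the pattern contains capture groups, however, the captured text is kept in the returned array. The following is a minimal sketch of that idea, not a general HTML parser: the `anchor` pattern is an assumption of this example and only handles double-quoted `class` attributes and anchors without nested tags.

```ruby
# Sample input copied from the question.
string = 'lorep ipsum <a href="#" class="link-1">dolor sit</a>amet, consectetur <a href="#" class="link-2">adipiscing</a> elit.'

# Hypothetical pattern for one anchor: group 1 captures the class attribute,
# group 2 the link text. Assumes double quotes and no nested tags.
anchor = /<a\b[^>]*\bclass="([^"]*)"[^>]*>(.*?)<\/a>/m

# Because the pattern has capture groups, split keeps them in the result:
# [text, class, link_text, text, class, link_text, ..., text]
parts = string.split(anchor)

# Regroup the flat list: each slice is [text, class, link_text];
# the final slice may hold only the trailing text.
result = parts.each_slice(3).flat_map do |text, klass, link_text|
  items = []
  items << text unless text.empty?          # skip empty text before a leading link
  items << { klass => link_text } if klass  # klass is nil for the trailing slice
  items
end

p result
```

Unlike the desired output in the question, this keeps the whitespace around each fragment exactly as it appears in the input. For anything beyond simple, predictable markup, an HTML parser is the safer choice.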

1 answer:

Answer 0 (score: 0)

I would use Nokogiri:

require 'nokogiri'

doc = Nokogiri::HTML.parse <<-eot
lorep ipsum <a href="#" class="link-1">dolor sit</a>amet, consectetur <a href="#" class="link-2">adipiscing</a> elit.
eot

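# For each link, emit the text before it, a {class => text} hash, and the text after it.
# Text between two links is emitted twice (after one link, before the next); uniq drops the repeat.
ary = doc.search("//a").flat_map do |n|
  [n.previous_sibling.text.strip, { n['class'] => n.text.strip }, n.next_sibling.text.strip]
end.uniq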

p ary

Output:

["lorep ipsum", {"link-1"=>"dolor sit"}, "amet, consectetur", {"link-2"=>"adipiscing"}, "elit."]