Question

I have

tmp_body_symbols="things <st>hello</st> and <st>blue</st> by <st>orange</st>"
str1_markerstring = "<st>"
str2_markerstring = "</st>"
frags << tmp_body_symbols[/#{str1_markerstring}(.*?)#{str2_markerstring}/m, 1]

frags is "hello", but I want ["hello", "blue", "orange"].

How would I do that?

Answer 1

Use scan:

tmp_body_symbols.scan(/#{str1_markerstring}(.*?)#{str2_markerstring}/m).flatten

See also: Ruby docs for String#scan.
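
As a minimal sketch of why flatten is needed here: when the pattern contains a capture group, scan returns one array per match, so the result is nested until it is flattened. The Regexp.escape calls and the names open_marker, close_marker, nested and pattern are additions for illustration, not part of the answer above; escaping only matters if the markers ever contain regex metacharacters.

```ruby
tmp_body_symbols = "things <st>hello</st> and <st>blue</st> by <st>orange</st>"

# Assumption: escaping the markers is an added precaution, not in the
# original answer. For "<st>" and "</st>" it changes nothing.
open_marker  = Regexp.escape("<st>")
close_marker = Regexp.escape("</st>")
pattern = /#{open_marker}(.*?)#{close_marker}/m

# With one capture group, scan yields one single-element array per match.
nested = tmp_body_symbols.scan(pattern)
# nested => [["hello"], ["blue"], ["orange"]]

# flatten collapses the nesting into the array the question asks for.
frags = nested.flatten
# frags => ["hello", "blue", "orange"]
```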

Answer 2

You can use Nokogiri to parse HTML/XML:

require 'nokogiri'

doc = Nokogiri::HTML::Document.parse("things <st>hello</st> and <st>blue</st> by <st>orange</st>")
doc.css('st').map(&:text)
#=> ["hello", "blue", "orange"]

More information: http://www.nokogiri.org/tutorials/parsing_an_html_xml_document.html

Answer 3

You can do this with a capture group, as @Doorknob has done, or without a capture group, using a ("zero-width") positive lookbehind and positive lookahead:

tmp = "things <st>hello</st> and <st>blue</st> by <st>orange</st>"
s1 = "<st>"
s2 = "</st>"

tmp.scan(/(?<=#{ s1 }).*?(?=#{ s2 })/).flatten
  #=> ["hello", "blue", "orange"]

(?<=#{ s1 }), which evaluates to (?<=<st>), is a positive lookbehind.
(?=#{ s2 }), which evaluates to (?=</st>), is a positive lookahead.

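The ? following .* makes .* lazy (non-greedy), so each match stops at the first closing marker. Without it, .* is greedy and a single match runs through to the last closing marker: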

tmp.scan(/(?<=#{ s1 }).*(?=#{ s2 })/).flatten
  #=> ["hello</st> and <st>blue</st> by <st>orange"]
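
The lookaround approach can be wrapped in a small helper, sketched below. The method name substrings_between and the use of Regexp.escape are hypothetical additions, not part of the answer above. Because the pattern has no capture group, scan already returns a flat array of strings, so flatten is not strictly required.

```ruby
# Hypothetical helper (not from the answer above): returns every substring
# that sits between open_marker and close_marker in text.
def substrings_between(text, open_marker, close_marker)
  # Assumption: markers are literal strings, so escape any regex
  # metacharacters before interpolating them into the pattern.
  open_re  = Regexp.escape(open_marker)
  close_re = Regexp.escape(close_marker)

  # Lookbehind and lookahead keep the markers out of each match;
  # the lazy .*? stops at the first closing marker.
  text.scan(/(?<=#{open_re}).*?(?=#{close_re})/m)
end

tmp = "things <st>hello</st> and <st>blue</st> by <st>orange</st>"
result = substrings_between(tmp, "<st>", "</st>")
# result => ["hello", "blue", "orange"]
```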

Get multiple substrings from a string in Ruby
