Remove the last span from a string

Time: 2017-12-19 03:17:57

Tags: ruby-on-rails ruby

I have this string:

"<span class='break'><div class='name-and-date'><strong>Mr. Talon 
 Williamson - Dec 18,  1:47 PM Eastern</div></strong><div class='note-
 contents'>- wrong</div></span><span class='break'><div class='name-and-
 date'><strong>Mr. Talon Williamson - Dec 18,  1:47 PM Eastern</div>
 </strong><div class='note-contents'>- Wrong again</div></span><span 
 class='break'><div class='name-and-date'><strong>Mr. Talon Williamson 
 - Dec 18,  1:47 PM Eastern</div></strong><div class='note-contents'>- 
 okay what is the matter with you.</div></span><span class='break'><div 
 class='name-and-date'><strong>Mr. Talon Williamson - Dec 18,  1:50 PM 
 Eastern</div></strong><div class='note-contents'>- Bro!</div></span>"

How can I remove the last span from this string so that I get this return value:

"<span class='break'><div class='name-and-date'><strong>Mr. Talon 
 Williamson - Dec 18,  1:47 PM Eastern</div></strong><div class='note-
 contents'>- wrong</div></span><span class='break'><div class='name-and-
 date'><strong>Mr. Talon Williamson - Dec 18,  1:47 PM Eastern</div>
 </strong><div class='note-contents'>- Wrong again</div></span><span 
 class='break'><div class='name-and-date'><strong>Mr. Talon Williamson 
 - Dec 18,  1:47 PM Eastern</div></strong><div class='note-contents'>- 
 okay what is the matter with you.</div></span>"

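I understand that parsing HTML with Nokogiri is the better practice, but for my use case it is important to preserve the string's integrity. That means the result must be exactly the same as the input, except with the last span removed.

I was thinking of doing something like this: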

string.scan(/<span class='break'>/)

However, that does not capture the whole spans and break them up into array elements.
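To illustrate the behavior described above, here is a minimal sketch. `String#scan` with a pattern that has no capture groups returns only the text each match covers, so a pattern for the opening tag yields one copy of that tag per occurrence and nothing else. The `sample` string is a shortened, hypothetical stand-in for the real data.

```ruby
# Shortened, hypothetical stand-in for the real string.
sample = "<span class='break'>a</span><span class='break'>b</span>"

# scan returns only the matched text, not the content that follows each tag.
tags = sample.scan(/<span class='break'>/)
# tags == ["<span class='break'>", "<span class='break'>"]
```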

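Note: I asked a similar question earlier, and I appreciate the help, but it was not quite what I needed.

3 answers:

Answer 0: (score: 2)

See if this helps. Is this what you are looking for?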

txt = "<span class='break'><div class='name-and-date'><strong>Mr. Talon 
 Williamson - Dec 18,  1:47 PM Eastern</div></strong><div class='note-
 contents'>- wrong</div></span><span class='break'><div class='name-and-
 date'><strong>Mr. Talon Williamson - Dec 18,  1:47 PM Eastern</div>
 </strong><div class='note-contents'>- Wrong again</div></span><span 
 class='break'><div class='name-and-date'><strong>Mr. Talon Williamson 
 - Dec 18,  1:47 PM Eastern</div></strong><div class='note-contents'>- 
 okay what is the matter with you.</div></span><span class='break'><div 
 class='name-and-date'><strong>Mr. Talon Williamson - Dec 18,  1:50 PM 
 Eastern</div></strong><div class='note-contents'>- Bro!</div></span>"

txt.rindex('<span')
# => 540 
txt.rindex('</span')
# => 700 
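# String#[] with two integers takes (start, length), not (start, end index).
# Slicing with a range from the last '<span' to the end of the string avoids
# relying on an oversized length being clipped; it assumes the last span ends txt.
txt[txt.rindex('<span')..-1]
# => "<span class='break'><div \n class='name-and-date'><strong>Mr. Talon Williamson - Dec 18,  1:50 PM \n Eastern</div></strong><div class='note-contents'>- Bro!</div></span>" 
txt[txt.rindex('<span')..-1] = ""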
txt
# => "<span class='break'><div class='name-and-date'><strong>Mr. Talon \n Williamson - Dec 18,  1:47 PM Eastern</div></strong><div class='note-\n contents'>- wrong</div></span><span class='break'><div class='name-and-\n date'><strong>Mr. Talon Williamson - Dec 18,  1:47 PM Eastern</div>\n </strong><div class='note-contents'>- Wrong again</div></span><span \n class='break'><div class='name-and-date'><strong>Mr. Talon Williamson \n - Dec 18,  1:47 PM Eastern</div></strong><div class='note-contents'>- \n okay what is the matter with you.</div></span>"
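The same idea can be wrapped in a small non-mutating method. This is only a sketch: `remove_last_span` is a hypothetical name, the `sample` string is a shortened stand-in, and the code assumes the last span is not nested inside another span and is the final thing in the string.

```ruby
# Hypothetical helper: returns a copy of html without the last span.
# Assumes the last "<span" opens a span that runs to the end of the string.
def remove_last_span(html)
  start = html.rindex("<span")
  return html.dup if start.nil? # no span found: return an unchanged copy
  html[0...start]               # everything before the last "<span"
end

sample = "<span class='break'>a</span><span class='break'>b</span>"
result = remove_last_span(sample)
# result == "<span class='break'>a</span>"
```

Because it slices rather than assigns, the original string is left untouched and every character before the last span is preserved as-is.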

Answer 1: (score: 2)

You can achieve this in several ways.

Assuming you have that string in a txt variable, txt.split("<span class='break'")[0..-2].join("<span class='break'") does the job easily. That is all there is to it.
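A short sketch of the split-and-join approach on a shortened, hypothetical `sample` string shows why it works: because the string starts with the delimiter, `split` yields a leading empty element, and joining with the same delimiter puts the opening tag back in front of each remaining piece. The delimiter passed to `join` has to be identical to the one passed to `split`, closing quote included.

```ruby
# Delimiter shared by split and join; it must be identical in both calls.
delim = "<span class='break'"

# Shortened, hypothetical stand-in for the real string.
sample = "<span class='break'>a</span><span class='break'>b</span>"

parts = sample.split(delim)
# parts == ["", ">a</span>", ">b</span>"]

# Drop the last piece and rejoin with the same delimiter.
trimmed = parts[0..-2].join(delim)
# trimmed == "<span class='break'>a</span>"
```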

Answer 2: (score: 0)

The following works unless a nested span has the class "break".

input.scan(%r|<span\s+class=['"]break["']>.*?</span>|m)[0...-1].join

Slightly slower, but always works as expected:

input[%r|.*(?=<span\s+class=['"]break["']>.*?</span>\z)|m]

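The latter solution uses a positive lookahead to capture everything up to the last occurrence of the pattern that is immediately followed by the end of the string (\z).

See the documentation for String#[] with a regular expression as the argument for more information.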
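Both expressions can be tried side by side on a small input. The sketch below uses a shortened, hypothetical `sample` string with mixed quote styles and an embedded newline to exercise the `['"]` character classes and the `m` flag. It assumes no nested spans.

```ruby
# Shortened, hypothetical stand-in with mixed quotes and a newline.
sample = "<span class='break'>a</span><span class=\"break\">b\nc</span>"

# Approach 1: collect every span, drop the last one, and rejoin.
pattern = %r|<span\s+class=['"]break["']>.*?</span>|m
by_scan = sample.scan(pattern)[0...-1].join

# Approach 2: take everything before the final span via a lookahead.
by_lookahead = sample[%r|.*(?=<span\s+class=['"]break["']>.*?</span>\z)|m]

# Both give "<span class='break'>a</span>" for this input.
```

One difference worth keeping in mind: the `scan` version rebuilds the string from the matched spans only, so any text sitting between spans would be dropped, while the lookahead version returns the original prefix unchanged.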