I have this string:
"<span class='break'><div class='name-and-date'><strong>Mr. Talon
Williamson - Dec 18, 1:47 PM Eastern</div></strong><div class='note-
contents'>- wrong</div></span><span class='break'><div class='name-and-
date'><strong>Mr. Talon Williamson - Dec 18, 1:47 PM Eastern</div>
</strong><div class='note-contents'>- Wrong again</div></span><span
class='break'><div class='name-and-date'><strong>Mr. Talon Williamson
- Dec 18, 1:47 PM Eastern</div></strong><div class='note-contents'>-
okay what is the matter with you.</div></span><span class='break'><div
class='name-and-date'><strong>Mr. Talon Williamson - Dec 18, 1:50 PM
Eastern</div></strong><div class='note-contents'>- Bro!</div></span>"
How can I remove the last span from this string so that I get this return value:
"<span class='break'><div class='name-and-date'><strong>Mr. Talon
Williamson - Dec 18, 1:47 PM Eastern</div></strong><div class='note-
contents'>- wrong</div></span><span class='break'><div class='name-and-
date'><strong>Mr. Talon Williamson - Dec 18, 1:47 PM Eastern</div>
</strong><div class='note-contents'>- Wrong again</div></span><span
class='break'><div class='name-and-date'><strong>Mr. Talon Williamson
- Dec 18, 1:47 PM Eastern</div></strong><div class='note-contents'>-
okay what is the matter with you.</div></span>"
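I understand that parsing the HTML with Nokogiri would be better practice, but for my use case it is important to preserve the string exactly. That means it must be identical except for the removal of the last span.
I was thinking of doing something like this: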
string.scan(/<span class='break'>/)
However, that doesn't capture the whole spans and break them up into array elements.
Answer 0: (score: 2)
See if this helps. Is this what you're looking for?
txt = "<span class='break'><div class='name-and-date'><strong>Mr. Talon
Williamson - Dec 18, 1:47 PM Eastern</div></strong><div class='note-
contents'>- wrong</div></span><span class='break'><div class='name-and-
date'><strong>Mr. Talon Williamson - Dec 18, 1:47 PM Eastern</div>
</strong><div class='note-contents'>- Wrong again</div></span><span
class='break'><div class='name-and-date'><strong>Mr. Talon Williamson
- Dec 18, 1:47 PM Eastern</div></strong><div class='note-contents'>-
okay what is the matter with you.</div></span><span class='break'><div
class='name-and-date'><strong>Mr. Talon Williamson - Dec 18, 1:50 PM
Eastern</div></strong><div class='note-contents'>- Bro!</div></span>"
txt.rindex('<span')
# => 540
txt.rindex('</span')
# => 700
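# String#[] with two integers takes (start, length), not (start, end),
# so a range running to the end of the string states the intent directly.
txt[txt.rindex('<span')..-1]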
# => "<span class='break'><div \n class='name-and-date'><strong>Mr. Talon Williamson - Dec 18, 1:50 PM \n Eastern</div></strong><div class='note-contents'>- Bro!</div></span>"
# Replace everything from the last "<span" to the end of the string with "".
txt[txt.rindex('<span')..-1] = ""
txt
# => "<span class='break'><div class='name-and-date'><strong>Mr. Talon \n Williamson - Dec 18, 1:47 PM Eastern</div></strong><div class='note-\n contents'>- wrong</div></span><span class='break'><div class='name-and-\n date'><strong>Mr. Talon Williamson - Dec 18, 1:47 PM Eastern</div>\n </strong><div class='note-contents'>- Wrong again</div></span><span \n class='break'><div class='name-and-date'><strong>Mr. Talon Williamson \n - Dec 18, 1:47 PM Eastern</div></strong><div class='note-contents'>- \n okay what is the matter with you.</div></span>"
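The same idea can be wrapped in a small non-mutating helper. This is only a sketch: `remove_last_span` and `sample` are names made up for illustration, and it assumes the last span sits at the very end of the string and contains no nested spans.

```ruby
# Hypothetical helper (not part of the answer above): return a copy of the
# string with everything from the last "<span" onward removed. The part that
# is kept is not modified in any way.
def remove_last_span(html)
  start = html.rindex('<span')
  return html.dup if start.nil? # no span found: return an unchanged copy

  html[0...start]
end

sample = "<span class='break'>a</span><span class='break'>b</span>"
trimmed = remove_last_span(sample)
puts trimmed
```

Because it slices instead of assigning into the string, the original `sample` is left untouched.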
Answer 1: (score: 2)
You can achieve this in several ways.
Assuming you have that string in a txt variable:
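# Join with exactly the same separator used for the split (including the
# closing quote), otherwise the rebuilt string differs from the original.
txt.split("<span class='break'")[0..-2].join("<span class='break'")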
It is easy to get working, at least for the string in the question.
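To see why the split/join approach works, it may help to look at the intermediate array on a shorter string. This is a minimal sketch; `sep`, `sample`, `parts` and `result` are illustrative names, and the two-span sample stands in for the real string.

```ruby
# The separator is the opening of each span; it is consumed by split and
# put back by join, so both calls must use the identical string.
sep = "<span class='break'"
sample = "<span class='break'>a</span><span class='break'>b</span>"

parts = sample.split(sep)
# parts => ["", ">a</span>", ">b</span>"]
# The leading "" appears because the string starts with the separator.

result = parts[0..-2].join(sep)
# result => "<span class='break'>a</span>"
puts result
```

Dropping the last element removes the final span's body, and joining the rest restores the separators between the remaining pieces.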
Answer 2: (score: 0)
The following works unless a nested span has the class "break":
input.scan(%r|<span\s+class=['"]break["']>.*?</span>|m)[0...-1].join
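One thing to be aware of with scan plus join: only the matched spans are kept, so any text between spans is dropped. The sketch below uses a made-up `sample` with newlines between the spans to show this; `SPAN`, `spans` and `joined` are illustrative names.

```ruby
# Same pattern as in the scan call above, stored in a constant for reuse.
SPAN = %r|<span\s+class=['"]break["']>.*?</span>|m

sample = "<span class='break'>a</span>\n" \
         "<span class='break'>b</span>\n" \
         "<span class='break'>c</span>"

spans = sample.scan(SPAN)   # one array element per matched span
joined = spans[0...-1].join # drop the last span, glue the rest together

# The newline that separated the first two spans is gone, because it was
# never part of any match.
puts joined
```

For the string in the question this should not matter, since the spans there appear to follow each other directly, but it is worth checking if the string must be preserved exactly.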
Slightly slower, but always works as expected:
input[%r|.*(?=<span\s+class=['"]break["']>.*?</span>\z)|m]
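The latter solution uses a positive lookahead to capture everything up to the last occurrence of the pattern, the one that is immediately followed by the end of the string (\z).
See the documentation for String#[] for more on passing a regular expression as the argument.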
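Here is a small sketch of how the lookahead version behaves, again on a made-up `sample` (the names `PATTERN`, `sample` and `kept` are illustrative). It assumes the string ends exactly with the closing `</span>`, with no trailing newline.

```ruby
# Greedy .* grabs as much as it can, then backtracks until the lookahead
# finds a "break" span that runs to the very end of the string (\z).
PATTERN = %r|.*(?=<span\s+class=['"]break["']>.*?</span>\z)|m

sample = "<span class='break'>a</span>\n<span class=\"break\">b</span>"

kept = sample[PATTERN]
# kept => "<span class='break'>a</span>\n"
puts kept.inspect

# String#[] with a regexp returns nil when nothing matches.
puts "no spans here"[PATTERN].inspect
```

Unlike scan plus join, the text before the last span is returned as one untouched slice, so the newline between the spans survives.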