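I'm trying to build a Rails application that takes an RSS feed and displays the news stories on a page.
When I fetch an article's summary I save it as a string, but every story's summary ends with a lot of unnecessary markup. For example: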
The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an "aggressive fighting stance" when police attempted to arrest him, according to a probable cause affidavit.<div class="feedflare">
<a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:7Q72WNTAKBA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=7Q72WNTAKBA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=qj6IDK7rITs" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/rss/cnn_topstories/~4/4W6duenqKrY" height="1" width="1"/>
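The only line I want to keep is:
The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an "aggressive fighting stance" when police attempted to arrest him, according to a probable cause affidavit.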
I want to strip out everything from the <div class="feedflare"> tag onward, including the tag itself.
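I'm stuck on how to do this. If anyone could point me to a Ruby string method or a regular expression that would do it, I'd be very grateful. I've been stuck on this for a while, since I'm new to both Ruby and regular expressions.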
Answer 0 (score: 2)
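You tagged this rails, so I assume you're using it. Rails has a nice sanitizing helper built in (in Rails 4.2 and later the equivalent class is Rails::Html::FullSanitizer):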
[6] pry(main)> HTML::FullSanitizer.new.sanitize('The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an "aggressive fighting stance" when police attempted to arrest him, according to a probable cause affidavit.<div class="feedflare">')
=> "The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an \"aggressive fighting stance\" when police attempted to arrest him, according to a probable cause affidavit."
Several methods can help with this; also take a look at strip_links and strip_tags, both documented under ActionView::Helpers::SanitizeHelper.
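If the feed always appends the same <div class="feedflare"> block, a plain String method may be enough and no sanitizer is needed. The sketch below is only an illustration: the helper name summary_before_marker, the FEEDFLARE_MARKER constant, and the shortened sample string with placeholder URLs are assumptions, not part of the original question.

```ruby
# Hypothetical constant and helper; both names are illustrative only.
FEEDFLARE_MARKER = '<div class="feedflare">'

def summary_before_marker(raw)
  # String#partition splits at the first occurrence of the marker and
  # returns [before, marker, after]. If the marker is missing, `before`
  # is the whole string, so unmarked summaries pass through unchanged.
  before, _marker, _after = raw.partition(FEEDFLARE_MARKER)
  before.strip
end

# Shortened sample input with placeholder URLs (not real feed data).
raw = %Q[Story summary text.<div class="feedflare">\n<a href="http://example.invalid/a"><img src="http://example.invalid/b" border="0"></img></a>\n</div>]
puts summary_before_marker(raw)
```

Unlike a full sanitizer, this keeps any markup that appears before the marker, so it only fits feeds whose summaries are plain text up to that point.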
Answer 1 (score: 0)
Looking at it more closely, a simple regular expression is enough. Take a look.
a=%Q[The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an "aggressive fighting stance" when police attempted to arrest him, according to a probable cause affidavit.<div class="feedflare">
<a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:7Q72WNTAKBA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=7Q72WNTAKBA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=qj6IDK7rITs" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/rss/cnn_topstories/~4/4W6duenqKrY" height="1" width="1"/>]
# Remove every <...> tag; [^>] already matches newlines, so no /m flag is needed.
a.gsub(/<[^>]+>/, "")
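The gsub call above removes the tags but keeps whatever text and whitespace sat between them. When the goal is to drop everything from the feedflare block onward, a single anchored substitution may fit better. This is a minimal sketch, assuming the marker always appears exactly as <div class="feedflare">; the method name strip_feedflare and the shortened sample string are hypothetical.

```ruby
# Hypothetical helper; the name is illustrative only.
def strip_feedflare(raw)
  # \s* also swallows whitespace directly before the marker.
  # The /m flag lets `.` match newlines, so `.*\z` consumes everything
  # through the end of the string. With no match, sub returns a copy
  # of the input unchanged.
  raw.sub(/\s*<div class="feedflare">.*\z/m, "")
end

# Shortened sample input with placeholder URLs (not real feed data).
raw = %Q[Story summary text.<div class="feedflare">\n<a href="http://example.invalid/a">link</a>\n</div><img src="http://example.invalid/c" height="1" width="1"/>]
puts strip_feedflare(raw)
```

If the feed ever changes the attribute order or quoting of that div, the pattern will stop matching and the summary will come back untouched, so an HTML-aware sanitizer is the safer choice for markup that varies.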