Ruby正则表达式或字符串操作

时间:2014-10-06 21:27:03

标签: ruby-on-rails ruby regex string

我正在尝试创建一个Rails应用程序,该应用程序接收RSS提要并向页面显示新闻故事。

当我得到文章的摘要时,我将其保存为字符串,但每个故事的摘要在最后都有很多标记,这是不必要的。例如:

The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an "aggressive fighting stance" when police attempted to arrest him, according to a probable cause affidavit.<div class="feedflare">
<a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:7Q72WNTAKBA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=7Q72WNTAKBA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=qj6IDK7rITs" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/rss/cnn_topstories/~4/4W6duenqKrY" height="1" width="1"/>

我想要保留的唯一一行是:

  据可能的原因宣誓书称,迈阿密海豚在涉嫌触碰女性后,暂停一名防守线人,然后在警方试图逮捕他时采取“激进的战斗姿态”。

我想解析包括<div class="feedflare">标记之后的所有内容。

我很难过如何做到这一点。如果有人可以请提供我可以用来执行此操作的ruby字符串操作方法或正则表达式方法,我将非常感激。我已经被困在这一段时间了,因为我是Ruby和正则表达式的新手。

2 个答案:

答案 0 :(得分:2)

你标记了rails,所以我假设你正在使用它。 Rails内置了一个很好的清理帮手:

[6] pry(main)> HTML::FullSanitizer.new.sanitize('The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an "aggressive fighting stance" when police attempted to arrest him, according to a probable cause affidavit.<div class="feedflare">')
=> "The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an \"aggressive fighting stance\" when police attempted to arrest him, according to a probable cause affidavit."

有多种方法可以帮助解决此问题,请同时查看strip_linkstrip_tagshere

答案 1 :(得分:0)

我更仔细地看一下这个简单的ER表达就足够了。看一看。

a=%Q[The Miami Dolphins have suspended a defensive lineman after he allegedly touched women and then took an "aggressive fighting stance" when police attempted to arrest him, according to a probable cause affidavit.<div class="feedflare">
<a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:7Q72WNTAKBA"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=7Q72WNTAKBA" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:V_sGLiPBpWU" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:qj6IDK7rITs"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?d=qj6IDK7rITs" border="0"></img></a> <a href="http://rss.cnn.com/~ff/rss/cnn_topstories?a=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/rss/cnn_topstories?i=4W6duenqKrY:kemJFf3BScg:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/rss/cnn_topstories/~4/4W6duenqKrY" height="1" width="1"/>]

     a.gsub(/\<[^\>]+\>/m, "")