Match everything until the next match

Time: 2013-07-04 11:04:00

Tags: regex

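I want to match a piece of HTML starting at each "..." and running up to the next occurrence of "..." or the end of the input.

Currently I have the following regular expression: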

(<font color=\"#777777\">\.\.\. .+?<\/font>)

It only matches:

1. <font color="#777777">... </font><font color="#000000">lives up to the customer's expectations. The subscriber is </font>
2. <font color="#777777">... You may not want them to be </font>
3. <font color="#777777">... </font><font color="#000000">the web link, and </font>

But I want:

1. <font color="#777777">... </font><font color="#000000">lives up to the customer's expectations. The subscriber is </font><font color="#777777">obviously thinking about your merchandise </font><font color="#000000">in case they have clicked about the link in your email.</font>
2. <font color="#777777">... You may not want them to be </font><font color="#000000">disappointed by simply clicking </font>
3. <font color="#777777">... </font><font color="#000000">the web link, and </font><font color="#777777">finding </font><font color="#000000">the page to </font><font color="#777777">get other than </font><font color="#000000">what they thought it </font><font color="#777777">will be.. If America makes</font>

This is the HTML I want to parse:

<font color="#777777">... </font><font color="#000000">lives up to the customer's expectations. The subscriber is </font><font color="#777777">obviously thinking about your merchandise  </font><font color="#000000">in case they have clicked about the link in your email.</font><font color="#777777">... You may not want them to be </font><font color="#000000">disappointed by simply clicking </font><font color="#777777">... </font><font color="#000000">the web link, and </font><font color="#777777">finding  </font><font color="#000000">the page to </font><font color="#777777">get other than  </font><font color="#000000">what they thought it </font><font color="#777777">will be.. If America makes</font>

Demo: http://rubular.com/r/mmQ4TBZb96

How can I match all the text starting at each "..." so that I get the desired matches above?

Thanks for your help!

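2 answers:

Answer 0: (score: 2)

Even though your question seems inconsistent (I don't see why you would get the last desired match), I think this is what you are after: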

((<font color=\"#777777\">\.{3}) .+?(<\/font>(?=\s*\2)|$))

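It uses a lookahead so that each match ends just before the next "..." sequence (or at the end of the input). The backreference \2 inside the lookahead repeats the opening <font color="#777777">... captured by the second group, so the closing </font> is only accepted when another "..." block follows it.

See this on Rubular.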
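For readers who want to try the lookahead pattern outside Rubular, here is a minimal sketch in Python (the answer itself does not name a language, so the use of Python's re module and the variable names html, pattern and matches are assumptions). It runs the regex from this answer against the HTML from the question and collects the full text of each match.

```python
import re

# The HTML from the question, kept as a single line (adjacent string
# literals are concatenated; the double spaces are part of the input).
html = (
    '<font color="#777777">... </font>'
    '<font color="#000000">lives up to the customer\'s expectations. The subscriber is </font>'
    '<font color="#777777">obviously thinking about your merchandise  </font>'
    '<font color="#000000">in case they have clicked about the link in your email.</font>'
    '<font color="#777777">... You may not want them to be </font>'
    '<font color="#000000">disappointed by simply clicking </font>'
    '<font color="#777777">... </font>'
    '<font color="#000000">the web link, and </font>'
    '<font color="#777777">finding  </font>'
    '<font color="#000000">the page to </font>'
    '<font color="#777777">get other than  </font>'
    '<font color="#000000">what they thought it </font>'
    '<font color="#777777">will be.. If America makes</font>'
)

# Same pattern as in the answer. Group 2 captures the opening "..." tag,
# and the lookahead (?=\s*\2) only accepts a closing </font> that is
# directly followed by another such tag; otherwise the match runs to $.
pattern = re.compile(r'((<font color="#777777">\.{3}) .+?(</font>(?=\s*\2)|$))')

# finditer + group(0) gives the whole match rather than a tuple of groups.
matches = [m.group(0) for m in pattern.finditer(html)]

for number, text in enumerate(matches, start=1):
    print(number, text)
```

Because `.` does not match newlines by default in Python, an input spread over several lines would need the re.DOTALL flag; the question's input is a single line, so the sketch leaves it off.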

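Answer 1: (score: 0)

The question is about regular expressions, but you can also do it with a split on a zero-width lookahead (Perl syntax, but I believe other languages have a similar function):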

split(/(?=<font color=\"#777777\">\.\.\.)/, $your_text)
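As an illustration that the same idea carries over to another language, here is a rough Python equivalent of the split above. This is a sketch under assumptions: it needs Python 3.7 or later, where re.split can split on empty matches, and the shortened sample string and the names text and pieces are made up for the example.

```python
import re

# Shortened, hypothetical sample in the same shape as the question's HTML.
text = (
    '<font color="#777777">... </font>'
    '<font color="#000000">the web link, and </font>'
    '<font color="#777777">... You may not want them to be </font>'
    '<font color="#777777">finding </font>'
)

# Split at every position where a "..." block begins. The lookahead is
# zero-width, so no characters are consumed and each piece keeps its tag.
raw_pieces = re.split(r'(?=<font color="#777777">\.\.\.)', text)

# Python yields an empty first element when the text itself starts with
# a "..." block, so empty strings are filtered out.
pieces = [piece for piece in raw_pieces if piece]

for piece in pieces:
    print(piece)
```

Unlike the lookahead regex, this approach needs no end-of-input alternative: whatever follows the last split point simply becomes the final piece.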