Question

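I am trying to use Yahoo Pipes to extract a URL from some content, but to do that I need to match everything before the URL as well as everything after it: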

<div class="medium mode player"><div class="info-header"><a rel="nofollow" target="_blank" 
href="http://i1.sndcdn.com/artworks-000059185212-dsb68g-crop.jpg?3eddc42" class="artwork" 
style="background:url(http://i1.sndcdn.com/artworks-000059185212-dsb68g-badge.jpg?
3eddc42);">Dream ft. Notorious BIG Artwork</a> <h3><a rel="nofollow" target="_blank" 
href="http://soundcloud.com/tom-misch/dream-ft-notorious-big">Dream ft. Notorious BIG</a>
</h3> <span class="subtitle"><span class="user tiny online"><a rel="nofollow" 
target="_blank" href="http://soundcloud.com/tom-misch" class="user-name">Tom Misch</a>

The URL I want is: http://soundcloud.com/tom-misch/dream-ft-notorious-big

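I have tried to learn a bit about regular expressions, but whenever I think I understand them, nothing I try works.

Hope some of you can help me out, guys! Cheers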

Answer 1

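This will probably do. It matches only URLs from soundcloud that use the http protocol and have no subdomain. The group captures the full URL so you can use it, and a lazy quantifier makes the match stop at the first quote: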

(http://soundcloud.*?)"

Here is an alternative. Instead of a lazy quantifier, it uses a negated character class to match anything except a quote:

(http://soundcloud[^"]+)

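Keep in mind that both regexes actually match two URLs in your sample. Depending on the library and the flags you use, you may get back only the first match or both. You can simply take the first one, or check further that the result has the format you expect.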
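To illustrate the point about one match versus two, here is a minimal sketch in Python. The choice of Python's `re` module is an assumption made for illustration only (it is not what Yahoo Pipes runs), and `html` is a shortened copy of the markup from the question.

```python
import re

# Shortened copy of the markup from the question (illustrative only).
html = (
    '<a rel="nofollow" target="_blank" '
    'href="http://i1.sndcdn.com/artworks-000059185212-dsb68g-crop.jpg?3eddc42" '
    'class="artwork">Dream ft. Notorious BIG Artwork</a> '
    '<h3><a rel="nofollow" target="_blank" '
    'href="http://soundcloud.com/tom-misch/dream-ft-notorious-big">'
    'Dream ft. Notorious BIG</a></h3> '
    '<a rel="nofollow" target="_blank" '
    'href="http://soundcloud.com/tom-misch" class="user-name">Tom Misch</a>'
)

# Variant 1: lazy quantifier, stops at the first quote after the URL.
lazy = re.compile(r'(http://soundcloud.*?)"')
# Variant 2: negated character class, consumes everything except a quote.
negated = re.compile(r'(http://soundcloud[^"]+)')

# search() returns only the first match.
first_lazy = lazy.search(html).group(1)
first_negated = negated.search(html).group(1)

# findall() returns every match, so the profile URL shows up as well.
all_urls = negated.findall(html)

print(first_lazy)
print(all_urls)
```

With a library that returns all matches, taking the first element of the result gives the track URL for this particular markup.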

If you really want to do this with a regex alone, and your regex library supports lookahead, you can do this:

(http://soundcloud[^"]+)"(?!\s+class="user-name")

The negative lookahead (?!...) prevents a match when the string that follows is class="user-name".
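A minimal sketch of the negative-lookahead idea, again assuming Python's `re` module purely for illustration. The pattern used here puts the lookahead directly after the closing quote so that the captured group holds only the URL; that placement is an assumption about the intent, and `html` is a shortened copy of the question's markup.

```python
import re

# Shortened copy of the markup from the question (illustrative only).
html = (
    '<h3><a rel="nofollow" target="_blank" '
    'href="http://soundcloud.com/tom-misch/dream-ft-notorious-big">'
    'Dream ft. Notorious BIG</a></h3> '
    '<a rel="nofollow" target="_blank" '
    'href="http://soundcloud.com/tom-misch" class="user-name">Tom Misch</a>'
)

# Capture a soundcloud URL up to its closing quote, but reject it when the
# quote is followed by whitespace and class="user-name" (the profile link).
pattern = re.compile(r'(http://soundcloud[^"]+)"(?!\s+class="user-name")')

track_urls = pattern.findall(html)
print(track_urls)
```

Only the track link survives, because the profile link is the one whose href is followed by class="user-name".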

I also could not find out which regex library Yahoo Pipes uses. If you want to replace everything around the URL, you can change the regex to:

^.*?(http://soundcloud[^"]+).*$

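Use $1 in the replacement string to get the URL. (Note that I mixed .*? with [^"]+. I want to replace the whole string with the first URL rather than the second, so the leading .* has to match only up to the start of the first URL and stop there, which is exactly what the lazy quantifier does.)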
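The replacement approach can be sketched as follows, assuming Python's `re` module for illustration. Two things differ from the description above and are specific to this sketch: Python writes the group reference as `\1` rather than `$1`, and the `re.DOTALL` flag is added so that `.` also crosses the line break in the sample input. Whether Yahoo Pipes needs an equivalent setting is not something this sketch can confirm.

```python
import re

# Shortened, multi-line copy of the markup from the question (illustrative only).
html = (
    '<h3><a rel="nofollow" target="_blank" \n'
    'href="http://soundcloud.com/tom-misch/dream-ft-notorious-big">'
    'Dream ft. Notorious BIG</a></h3> \n'
    '<a rel="nofollow" target="_blank" '
    'href="http://soundcloud.com/tom-misch" class="user-name">Tom Misch</a>'
)

# ^.*?  lazily skips everything before the first soundcloud URL,
# (...) captures the URL up to (not including) its closing quote,
# .*$   swallows the rest of the input, including the second URL.
pattern = re.compile(r'^.*?(http://soundcloud[^"]+).*$', re.DOTALL)

# Replace the whole input with the captured group.
url = pattern.sub(r'\1', html)
print(url)
```

If the leading `.*?` were greedy instead, it would run as far as possible and the group would capture the second URL, which is the situation the lazy quantifier avoids.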
