Need help with a regular expression replacement in PHP

Time: 2011-08-25 14:42:17

Tags: php regex preg-replace

I have a string that contains links following this pattern:

<a href="http://randomurl.com/random_string;url=http://anotherrandomurl.com/">xxxx</a>

I want to remove "http://xxx.xxx.xxx/random_string;url=" and keep the rest of the string, so that I end up with

<a href="http://anotherrandomurl.com/">xxxx</a>

Can anyone help?

4 answers:

Answer 0 (score: 1)

Use:

$new_link = preg_replace('/<a href="[^"]*;url=([^"]+)">/', '<a href="$1">', $url);

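Answer 1 (score: 1)

There are several ways to get what you want. An alternative to regex is to use strpos to find the occurrence of url=, then remove those characters and everything before them.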
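The strpos idea can be sketched roughly as follows. This assumes the href value has already been isolated from the surrounding markup, and that the marker to cut at is the full ";url=" from the question; the function name strip_redirect_prefix is hypothetical.

```php
<?php
// Hypothetical helper: drop everything up to and including the first ";url=".
// Assumes $href is a single attribute value, not the whole HTML string.
function strip_redirect_prefix(string $href): string
{
    $marker = ';url=';
    $pos = strpos($href, $marker);
    if ($pos === false) {
        // No redirect wrapper present, so return the value unchanged.
        return $href;
    }
    // Keep only the part that follows the marker.
    return substr($href, $pos + strlen($marker));
}

$clean = strip_redirect_prefix('http://randomurl.com/random_string;url=http://anotherrandomurl.com/');
```

If the wrapped URL could itself contain ";url=", strrpos (last occurrence) may be the better choice, depending on which part should survive.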

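Answer 2 (score: 1)

This is more complicated than you might think, and I urge you to avoid using regex for it.

Instead, use an HTML parser to find all the <a> tags in the document, then split each href attribute on ;url= and keep only the last part.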
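A minimal sketch of the parser approach, assuming PHP's bundled DOM extension is available and that the input is an HTML fragment rather than a full document. The function name unwrap_links and the wrapper <div> are illustrative choices, not part of the original answer.

```php
<?php
// Hypothetical helper: rewrite every <a href> that contains ";url=".
function unwrap_links(string $html): string
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in one root element; the flags stop libxml from
    // adding <html>/<body> wrappers and a doctype.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        // Split on ";url=" and keep only the last part, as described above.
        $parts = explode(';url=', $a->getAttribute('href'));
        if (count($parts) > 1) {
            $a->setAttribute('href', end($parts));
        }
    }

    // Serialize the children of the wrapper <div>, dropping the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$result = unwrap_links('<a href="http://randomurl.com/random_string;url=http://anotherrandomurl.com/">xxxx</a>');
```

Links without ";url=" are left untouched. Input containing non-ASCII text may need an explicit encoding hint, since loadHTML does not assume UTF-8 by default.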

However, if you must use a regular expression, the following works on most well-formed HTML:

preg_replace('/(<\s*a\s[^>]*\bhref=)(["\'])(?:(?!\2).)*;url=((?:(?!\2).)*)(\2[^>]*>)/i', '$1$2$3$4', $url)

Explanation:

(<\s*a\s[^>]*\bhref=)  # <a, optionally followed by other attributes, and then href. Whitespace after < is allowed. This is captured in $1.
(["\'])                # Either " or ' to open the href value. This is captured in $2 so the closing quote can be matched later.
(?:(?!\2).)*;url=      # Everything inside the quotes up to and including the last ";url=". (?!\2). matches any character except the quote in $2, because a backreference cannot be used inside a character class. This is thrown out.
((?:(?!\2).)*)         # This is the URL you want to keep. It keeps matching until the closing quote. This is captured in $3.
(\2[^>]*>)             # The closing quote and the remainder of the <a> tag, including any other attributes. This is captured in $4.

Answer 3 (score: 0)

$new_link = preg_replace('~(\shref=")[^"]+?(?<=;url=)~', '$1', $url);