Question

I have a bunch of strings, each of which contains an anchor tag and a URL.

Example string:

here is a link <a href="http://www.google.com">http://www.google.com</a>. enjoy!

I want to strip out the anchor tag and everything in between.

Example result:

here is a link. enjoy!

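The URL in the href= part doesn't always match the link text (sometimes the URL is shortened, sometimes the text is just descriptive).

I'm having a hard time figuring out how to do this with a regular expression or a PHP function. How can I strip an entire anchor tag/link out of a string?

Thanks!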

Answer 1

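Looking at your example result, it seems you're removing both the tag and its content. Do you want to keep the text you stripped out? If you only need the tags gone, you may be looking for strip_tags(); note that it leaves the text between the tags in place.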

Answer 2

You shouldn't use a regex to parse HTML; use an HTML parser instead.
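As a rough illustration of the parser route, here is a minimal sketch using PHP's bundled DOM extension. The function name removeAnchors and the wrapper div are assumptions made for this example, not part of the original answer, and the sketch assumes the input is a small HTML fragment rather than a full page.

```php
<?php
// Hypothetical helper: remove every <a> element, including its link text.
function removeAnchors(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings quietly instead of printing them for fragments.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a container div so only its children are serialized later.
    // The XML encoding prefix is a common trick to make loadHTML() read UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();

    // Copy the live node list into an array first; removing nodes while
    // iterating a live list would skip elements.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $anchor) {
        $anchor->parentNode->removeChild($anchor);
    }

    // The wrapper is the first div in document order.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo removeAnchors('here is a link <a href="http://www.google.com">http://www.google.com</a>. enjoy!'), PHP_EOL;
```

The space that preceded the anchor stays in the result, so a follow-up cleanup step may still be needed if the exact spacing matters.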

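But if you must use a regex, and the inner content of your anchor tag is guaranteed not to contain HTML such as </a>, and each string is guaranteed to contain only one anchor tag, as in your example, then, and only then, you can use something like:

Replace /^(.+)<a.+<\/a>(.+)$/

with $1$2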
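In PHP, that replacement could be applied with preg_replace() roughly as follows. The function name stripSingleAnchor is made up for this sketch, and it relies on the same guarantees stated above: exactly one anchor, with at least one character before and after it.

```php
<?php
// Hypothetical wrapper around the pattern from the answer above.
function stripSingleAnchor(string $str): string
{
    // $1 is the text before the anchor, $2 is the text after it.
    // Fall back to the input if preg_replace() reports an error (null).
    return preg_replace('/^(.+)<a.+<\/a>(.+)$/', '$1$2', $str) ?? $str;
}

echo stripSingleAnchor('here is a link <a href="http://www.google.com">http://www.google.com</a>. enjoy!'), PHP_EOL;
// prints "here is a link . enjoy!"
```

Because both (.+) groups require at least one character, a string that starts or ends with the anchor is returned unchanged.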

Answer 3

Since your problem seems to be very specific, I think this should do it:

$str = preg_replace('#\s?<a.*/a>#', '', $str);
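One caveat: .* is greedy, so if a string ever held two links, everything from the first <a to the last /a> would be removed. A possible variant, assuming anchors are never nested, uses a lazy quantifier so each anchor is matched on its own. The function name stripAnchors is hypothetical.

```php
<?php
// Hypothetical variant of the one-liner above with a lazy quantifier.
function stripAnchors(string $str): string
{
    // \s?      optional whitespace character right before the tag
    // <a\b     the tag name "a" only (so <abbr> is not matched)
    // [^>]*>   the rest of the opening tag
    // .*?</a>  the shortest possible run up to the next closing tag
    // i = case-insensitive, s = let . match newlines
    return preg_replace('#\s?<a\b[^>]*>.*?</a>#is', '', $str) ?? $str;
}

echo stripAnchors('here is a link <a href="http://www.google.com">http://www.google.com</a>. enjoy!'), PHP_EOL;
// prints "here is a link. enjoy!"
```

This is still a regex working on HTML, so it only holds up for simple, predictable input like the strings in the question.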

Answer 4

Just use plain PHP string functions.

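$str = 'here is a link <a href="http://www.google.com">http://www.google.com</a>. enjoy!';
// Split on the closing tag; every piece except the last one ends inside an anchor.
$parts = explode('</a>', $str);
foreach ($parts as $part) {
    if (strpos($part, 'href') !== false) {
        // Print only the text that precedes the opening <a.
        $start = strpos($part, '<a');
        echo substr($part, 0, $start);
    }
}
// Print whatever follows the final </a>.
print end($parts);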

Output

$ php test.php
here is a link . enjoy!

Answer 5

$string = 'here is a link <a href="http://www.google.com">http://www.google.com</a>. enjoy!';
$text = strip_tags($string);
echo $text; // Outputs "here is a link http://www.google.com. enjoy!" (strip_tags() keeps the link text)
