Content:
<a href="http://www.lipsum.com/">Lorem Ipsum</a> is simply dummy text
of the printing and typesetting industry.
<a href="http://www.google.com/1111/2222/3333">Lorem Ipsum</a> has been the industrys
standard dummy text ever since the 1500s, when an unknown printer
took a <a href="http://gallery.com">galley</a> of type and scrambled
it to make a type specimen <a href="http://www.google.com/1111/3333/4444">book</a>.
The content contains four "a href" links:
http://www.lipsum.com/
http://www.google.com/1111/2222/3333
http://www.google.com/1111/3333/4444
http://gallery.com/
I want this result, where the only link kept is the one whose href value matches href="http://google.com/1111/3333****:
Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industrys standard dummy text ever since the 1500s,
when an unknown printer took a galley of type and scrambled it to make a type
specimen <a href="http://www.google.com/1111/3333/4444">book</a>.
Does anyone know how to do this? I hope the question is understandable. Thanks in advance.
Answer 0 (score: 1)
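Using regular expressions to parse or transform HTML content is not a good idea.
But for your small snippet, and given that you need to keep the link text (e.g. "Lorem Ipsum") while removing the link itself, you can use the following preg_replace solution: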
$html = '<a href="http://www.lipsum.com/">Lorem Ipsum</a> is simply dummy text
of the printing and typesetting industry.
<a href="http://www.google.com/1111/2222/3333">Lorem Ipsum</a> has been the industrys
standard dummy text ever since the 1500s, when an unknown printer
took a <a href="http://gallery.com">galley</a> of type and scrambled
it to make a type specimen <a href="http://www.google.com/1111/3333/4444">book</a>.';
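// Match every http:// link whose URL does NOT pass the lookahead; group 1 captures the link text.
$re = '/<a href="http:\/\/(?!www\.google\.com\/1111\/3+\/[^>]+).*?>([^<>]+)<\/a>/m';
// Replace each matched link with its text only.
$result = preg_replace($re, '$1', $html);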
echo $result;
Output:
Lorem Ipsum is simply dummy text
of the printing and typesetting industry.
Lorem Ipsum has been the industrys
standard dummy text ever since the 1500s, when an unknown printer
took a galley of type and scrambled
it to make a type specimen <a href="http://www.google.com/1111/3333/4444">book</a>.
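(?!www\.google\.com\/1111\/3+\/[^>]+) - a negative lookahead assertion. It restricts the match to links whose href attribute value does not have the required form href="http://www.google.com/1111/3333****, so only those links are replaced by their text.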
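To see the lookahead in isolation, here is a minimal sketch that applies it to each URL from the question. The helper name isKeptLink is hypothetical and not part of the answer; the pattern is the same lookahead, anchored at the start of the URL so it can be tested on its own.

```php
<?php
// Hypothetical helper: returns true when a link would be left untouched,
// i.e. when the negative lookahead rejects the URL and the pattern fails.
function isKeptLink(string $href): bool
{
    // Same lookahead as in the answer, anchored to the start of the href.
    $re = '/^http:\/\/(?!www\.google\.com\/1111\/3+\/[^>]+)/';
    return preg_match($re, $href) === 0;
}

$hrefs = [
    'http://www.lipsum.com/',
    'http://www.google.com/1111/2222/3333',
    'http://gallery.com',
    'http://www.google.com/1111/3333/4444',
];

foreach ($hrefs as $href) {
    echo $href, ' => ', isKeptLink($href) ? 'kept' : 'unwrapped', PHP_EOL;
}
```

Only the last URL is reported as kept. Note that a URL that does not start with http:// (for example an https link) is also reported as kept, which mirrors the original pattern: it never matches such links, so they stay in place.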
----------
A more accurate approach is to use the DOMDocument / DOMXPath classes:
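// $html is the same string as in the preg_replace example.
$dom = new \DOMDocument();
$dom->loadHTML($html);
$xpath = new \DOMXPath($dom);
// Select every <a> whose href does not contain the required URL prefix.
$nodes = $xpath->query("//a[not(contains(@href, 'http://www.google.com/1111/3333'))]");
foreach ($nodes as $n) {
    // Swap the link element for a plain text node holding its text.
    $n->parentNode->replaceChild($dom->createTextNode($n->nodeValue), $n);
}
echo $dom->saveHTML($dom->documentElement);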
Output:
<html><body>Lorem Ipsum is simply dummy text
of the printing and typesetting industry.
Lorem Ipsum has been the industrys
standard dummy text ever since the 1500s, when an unknown printer
took a galley of type and scrambled
it to make a type specimen <a href="http://www.google.com/1111/3333/4444">book</a>.</body></html>
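The output above includes the <html><body> wrapper that loadHTML adds. If only the fragment is wanted, one possible variation is sketched below. The function name unwrapLinksExcept and the wrapper id fragment-root are assumptions made for illustration, and the prefix check is done in PHP with strncmp instead of the XPath contains() used in the answer.

```php
<?php
// Hypothetical helper (not from the original answer): unwrap every <a> whose
// href does not start with $keepPrefix and return only the fragment markup.
function unwrapLinksExcept(string $html, string $keepPrefix): string
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Assumption: the id "fragment-root" does not occur in the input.
    $dom->loadHTML('<div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//a') as $a) {
        $href = $a->getAttribute('href');
        // Keep the link only when its href starts with the given prefix.
        if (strncmp($href, $keepPrefix, strlen($keepPrefix)) !== 0) {
            $a->parentNode->replaceChild($dom->createTextNode($a->nodeValue), $a);
        }
    }

    // Serialize only the children of the wrapper, not <html>/<body>/<div>.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$html = '<a href="http://www.lipsum.com/">Lorem Ipsum</a> is simply dummy text, '
      . 'see the <a href="http://www.google.com/1111/3333/4444">book</a>.';
echo unwrapLinksExcept($html, 'http://www.google.com/1111/3333');
```

With this input the result should be the plain text with only the "book" link left as markup. Non-ASCII input may need an explicit encoding hint before loadHTML, which this sketch does not handle.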