Question

Code:

$str = 'http://www.google.com <img src="http://placehold.it/350x150" />';
$str = preg_replace('/\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $str);
echo $str;

Output:

<img src="" />

I need this output:

<img src="http://placehold.it/350x150" />

How can I do this?

Thanks for your help.

Answer 1

Your pattern

/\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i

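removes every URL in the string that starts with the http or https protocol. So when you apply it to your string, it removes both the URL at the beginning of the string and the URL in the src attribute of the <img> tag. To remove only the leading URL, anchor the pattern to the start of the string with ^: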

$str = 'http://www.google.com <img src="http://placehold.it/350x150" />';
$str = preg_replace('/^\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i', '', $str);
echo $str;


Or simply extract the part you need:

/(<img.*\/>)/i
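
As a rough sketch of how that extraction pattern might be used, the snippet below passes it to preg_match and prints the captured tag. The variable name $m is only illustrative, and the sketch assumes the string holds a single self-closing <img> tag; because .* is greedy, several tags in one string would be captured as one span from the first <img to the last />.

```php
<?php
$str = 'http://www.google.com <img src="http://placehold.it/350x150" />';

// Capture the <img ... /> tag instead of deleting the text around it.
// Assumption: the string contains exactly one self-closing <img> tag.
if (preg_match('/(<img.*\/>)/i', $str, $m)) {
    echo $m[1]; // <img src="http://placehold.it/350x150" />
}
```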


Answer 2

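I also think DOMDocument and DOMXPath are the preferred tools for parsing HTML markup.
But for your particular case, here is a solution that uses a regexp negative lookbehind assertion: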

$str = 'http://www.google.com <img src="http://placehold.it/350x150" /> http://www.google.com.ua';

$str = preg_replace('/(?<!src=\")(https|http):\/\/[^\s]+\b/i', '', $str);

print_r($str);   // <img src="http://placehold.it/350x150" />

This removes every URL except the one inside the img src attribute.
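
For comparison, the DOMDocument and DOMXPath route mentioned above could look roughly like the sketch below. It is not part of the original answer: the wrapping <div>, the simplified URL pattern, and the variable names are all assumptions made for illustration. The idea is to rewrite text nodes only, so attribute values such as src are never seen by the regex.

```php
<?php
$str = 'http://www.google.com <img src="http://placehold.it/350x150" /> http://www.google.com.ua';

// Wrap the fragment in a <div> so the parser has a single root element.
$doc = new DOMDocument();
$doc->loadHTML('<div>' . $str . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Visit text nodes only; attributes are not text nodes and stay untouched.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//text()') as $textNode) {
    // Simplified URL pattern (an assumption, not the asker's full pattern).
    $textNode->nodeValue = preg_replace('/\bhttps?:\/\/\S+/i', '', $textNode->nodeValue);
}

// Serialize the children of the wrapper <div> back into a string.
$result = '';
foreach ($doc->documentElement->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}
echo trim($result);
```

The serializer may write the tag back without the trailing slash, so the output is equivalent HTML rather than a byte-for-byte copy of the input tag.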

Answer 3

Try:

<[^>]*(*SKIP)(*FAIL)|\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]

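<[^>]* matches an opening < and everything after it up to, but not including, the closing >. (*SKIP)(*FAIL) then discards that match and makes the engine resume after it, so anything inside a tag is skipped.

The rest, after the |, is your original regex.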
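
A minimal sketch of that pattern in use with preg_replace follows. The delimiters, the /i flag, and the test string (borrowed from Answer 2) are assumptions added here for illustration.

```php
<?php
$str = 'http://www.google.com <img src="http://placehold.it/350x150" /> http://www.google.com.ua';

// First alternative: consume tag contents, then force a failure so they are skipped.
// Second alternative: the asker's original URL pattern.
$pattern = '/<[^>]*(*SKIP)(*FAIL)|\b(https?):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';

$result = preg_replace($pattern, '', $str);

// trim() drops the spaces left behind where the two URLs were removed.
echo trim($result); // <img src="http://placehold.it/350x150" />
```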

How to remove http and https URLs, except the one in img src
