Question

I'm having trouble with preg_replace. My regular expression doesn't work. I have the following code:

$text = "<p><i>Dernière modification le 20/06/2016 à 10:27</i></p><p>Some text...</p>";
$preg = preg_replace("/<p><i>Dernière modification le[\s\d\/\:\à]{1,}<\/i><\/p>$/", '', $text);

I want to remove the following line from the variable $text. The date can change, which is why I'm not using str_replace:

 <p><i>Dernière modification le 20/06/2016 à 10:27</i></p>

Thanks for your help.

Answer 1

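I tried both the regex approach and the DOMDocument approach, and regex looks like the better fit here (especially if you are dealing with strings from a known, trusted provider):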

//Regex way
$re = '/<p><i>Dernière modification le[\s\d\/:à]+<\/i><\/p>/u'; 
$str = "<p><i>Dernière modification le 20/06/2016 à 10:27</i></p><p>Some text...</p>"; 
$result = preg_replace($re, "", $str);
echo $result . "\n";

// DOMDocument way
$text = "<p><i>Dernière modification le 20/06/2016 à 10:27</i></p><p>Some text...</p>";
$dom = new DOMDocument('1.0', 'UTF-8');
// Convert non-ASCII characters to entities so loadHTML() reads them correctly
$dom->loadHTML(mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
// Select every <i> inside a <p> whose text starts with the marker phrase
$nodes = $xpath->query('//p/i[starts-with(.,"Dernière modification le")]');
foreach ($nodes as $node) {
    $node->parentNode->removeChild($node);
}
// Collapse each leftover run of <p>/</p> tags into a single tag
echo preg_replace('~(</?p>)+~', '$1', html_entity_decode($dom->saveHTML()));

See the IDEONE demo.

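When you use a regex on Unicode strings, you should add the /u modifier. You should also remove the $, because the text you want to remove is not at the end of the string.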
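Both points can be checked with a short sketch. It assumes the script file is saved as UTF-8; the variable names ($anchored, $fixed, $byteMode, $utf8Mode) and the single-character check are only illustrative and are not part of the original answer.

```php
<?php
$text = "<p><i>Dernière modification le 20/06/2016 à 10:27</i></p><p>Some text...</p>";

// Pattern that keeps the trailing $: the match would have to end at the end
// of the string, but "<p>Some text...</p>" follows, so nothing is replaced.
$anchored = preg_replace('/<p><i>Dernière modification le[\s\d\/:à]+<\/i><\/p>$/', '', $text);

// Corrected pattern: no $ anchor, and /u so pattern and subject are read as UTF-8.
$fixed = preg_replace('/<p><i>Dernière modification le[\s\d\/:à]+<\/i><\/p>/u', '', $text);

// Why /u matters: without it, "à" inside [...] is two separate bytes,
// so the class matches one byte at a time, never the whole character.
$byteMode = preg_match('/^[à]$/', 'à');   // 0
$utf8Mode = preg_match('/^[à]$/u', 'à');  // 1

var_dump($anchored === $text);  // bool(true)
echo $fixed, "\n";              // <p>Some text...</p>
```

In this particular input the $ anchor is what prevents the match; the /u modifier is still worth keeping so that character classes containing accented letters behave as expected.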

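When you use DOMDocument with a Unicode HTML string as input, you need to parse it, then select the i elements inside p whose text starts with Dernière modification le and remove them. I suggest an XPath such as '//p/i[starts-with(.,"Dernière modification le")]'.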
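A possible variant of the DOMDocument approach, sketched under a few assumptions that are not in the original answer: the fragment is wrapped in a temporary div so the parser sees a single root element, mb_encode_numericentity stands in for the HTML-ENTITIES conversion (deprecated as of PHP 8.2), and the whole p is removed rather than only the i, so no regex cleanup is needed afterwards. The function name removeModificationNote is hypothetical, and the dom and mbstring extensions are assumed to be available.

```php
<?php
// Hypothetical helper: drops every <p> that holds an <i> starting with the marker phrase.
function removeModificationNote(string $html): string
{
    // Encode non-ASCII characters as numeric entities so loadHTML() reads them correctly.
    $encoded = mb_encode_numericentity($html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $dom = new DOMDocument();
    // Assumption of this sketch: a wrapper <div> gives the fragment a single root element.
    $dom->loadHTML('<div>' . $encoded . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//p/i[starts-with(., "Dernière modification le")]') as $i) {
        $p = $i->parentNode;              // the enclosing <p>
        $p->parentNode->removeChild($p);  // remove the paragraph, not just the <i>
    }

    // Serialize only the wrapper's children, leaving the wrapper itself out.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$out = removeModificationNote("<p><i>Dernière modification le 20/06/2016 à 10:27</i></p><p>Some text...</p>");
echo $out, "\n";
```

Removing the paragraph itself avoids the empty p that is otherwise left behind, at the cost of the extra wrapper element.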
