Question

I have a string like this:

$dot_prod = "at the coast will reach the Douglas County coast";

I'd like to get this result using a regular expression: at the <b>coast</b> will reach <b>the</b> Douglas County coast

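Specifically, I want to bold the words "coast" and "the", but "coast" only when it is not preceded by the word "county", and "the" only when it is not preceded by the word "at". In essence, I want one array of words or phrases to bold (matched case-insensitively, preserving the case each word or phrase originally had) and a second array of words or phrases that must stay unbolded. For example, the array of words/phrases I want to bold is: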

$bold = array("coast", "the", "pass");

And the array of words/phrases that must stay unbolded is:

$unbold = array("county coast", "at the", "grants pass");

I can do the bolding like this:

$bold = array("coast", "the", "pass");

$dot_prod = preg_replace("/(" . implode("|", $bold) . ")/i", "<b>$1</b>", $dot_prod);

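However, I haven't managed to get the unbolding step to work afterwards, and I definitely can't figure out how to do it all in a single expression. Can you offer any help? Thanks.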

Answer 1

You can match and skip the patterns you want to leave "unbolded", and match the patterns you want to bold in every other context.

Build a regex like this (I added word boundaries to match whole words; you may not need them, but they look like a good idea for your current input):

'~\b(?:county coast|at the|grants pass)\b(*SKIP)(*F)|\b(?:coast|the|pass)\b~i'

See the regex demo.

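Details

\b - a word boundary
(?:county coast|at the|grants pass) - a non-capturing group matching any one of the alternatives
\b - a word boundary
(*SKIP)(*F) - PCRE verbs that discard the text matched so far and resume the search from the position where that match ended
| - or
\b - a word boundary
(?:coast|the|pass) - a non-capturing group matching any one of the alternatives
\b - a word boundary.

The $0 in the replacement is a reference to the whole match value.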
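To see which occurrences survive the skip, it can help to list the matches together with their offsets before doing any replacement. The following is only a sketch: the helper name find_bold_matches is made up for illustration and is not part of the original answer.

```php
<?php
// Hypothetical helper (name invented for this sketch): returns every match
// of the skip-then-match pattern as an [offset, text] pair.
function find_bold_matches(string $subject): array {
    $rx = '~\b(?:county coast|at the|grants pass)\b(*SKIP)(*F)|\b(?:coast|the|pass)\b~i';
    preg_match_all($rx, $subject, $m, PREG_OFFSET_CAPTURE);
    $found = [];
    foreach ($m[0] as [$text, $offset]) {
        $found[] = [$offset, $text];
    }
    return $found;
}

foreach (find_bold_matches("at the coast will reach the Douglas County coast") as [$offset, $text]) {
    echo $offset, ": ", $text, PHP_EOL;
}
// 7: coast
// 24: the
```

The "the" inside "at the" and the "coast" inside "County coast" never appear in the list, because the first alternative consumes them and then forces a failure.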

PHP demo:

$dot_prod = "at the coast will reach the Douglas County coast";
$bold = array("coast", "the", "pass");
$unbold = array("county coast", "at the", "grants pass");
$rx = "~\b(?:" . implode("|", $unbold) . ")\b(*SKIP)(*F)|\b(?:" . implode("|", $bold) . ")\b~i";
echo preg_replace($rx, "<b>$0</b>", $dot_prod);
// => at the <b>coast</b> will reach <b>the</b> Douglas County coast

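One caveat: since the search terms can contain whitespace, it is best to sort the $bold and $unbold arrays by length in descending order before building the pattern, so that a longer phrase is tried before a shorter term it starts with: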

usort($unbold, function($a, $b) { return strlen($b) - strlen($a); });
usort($bold, function($a, $b) { return strlen($b) - strlen($a); });
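A small sketch of why the order matters: in a regex alternation the leftmost alternative that matches wins, so a short term listed first can shadow a longer phrase that starts with it. The term "the coast" and the helper build_pattern below are assumptions added for illustration; they do not come from the question.

```php
<?php
// Hypothetical term list: "the coast" is added only to show the shadowing effect.
$bold = array("the", "the coast");
$subject = "reach the coast";

// Illustrative helper: wraps the terms in a whole-word, case-insensitive alternation.
function build_pattern(array $terms): string {
    return "~\\b(?:" . implode("|", $terms) . ")\\b~i";
}

// Unsorted: "the" is tried first and wins, so the phrase is never matched.
$unsorted = preg_replace(build_pattern($bold), "<b>$0</b>", $subject);

// Sorted by length, longest first: "the coast" is tried before "the".
usort($bold, function ($a, $b) { return strlen($b) - strlen($a); });
$sorted = preg_replace(build_pattern($bold), "<b>$0</b>", $subject);

echo $unsorted, PHP_EOL; // reach <b>the</b> coast
echo $sorted, PHP_EOL;   // reach <b>the coast</b>
```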

See another PHP demo.

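If the items may contain special regex metacharacters, also run preg_quote on them, passing the pattern delimiter (~ here) as the second argument.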
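Putting the sorting and the escaping together might look like the sketch below. The function name build_bold_regex and the extra term "u.s" are assumptions for illustration; preg_quote is given "~" as its second argument because that is the delimiter used in the pattern.

```php
<?php
// Hypothetical helper (name invented for this sketch): builds the
// skip-then-match pattern from the two term lists.
function build_bold_regex(array $bold, array $unbold): string {
    // Longest terms first, so phrases are tried before shorter terms.
    $byLengthDesc = function ($a, $b) { return strlen($b) - strlen($a); };
    usort($bold, $byLengthDesc);
    usort($unbold, $byLengthDesc);

    // Escape regex metacharacters and the "~" delimiter in every term.
    $quote = function ($term) { return preg_quote($term, "~"); };

    return "~\\b(?:" . implode("|", array_map($quote, $unbold)) . ")\\b(*SKIP)(*F)"
         . "|\\b(?:" . implode("|", array_map($quote, $bold)) . ")\\b~i";
}

// "u.s" is an illustrative term containing a metacharacter (the dot).
$rx = build_bold_regex(
    array("coast", "the", "pass", "u.s"),
    array("county coast", "at the", "grants pass")
);
echo preg_replace($rx, "<b>$0</b>", "the U.S coast and the uxs coast"), PHP_EOL;
// <b>the</b> <b>U.S</b> <b>coast</b> and <b>the</b> uxs <b>coast</b>
```

Without preg_quote, the dot in "u.s" would match any character and "uxs" would be bolded as well. Note that \b only behaves as expected when a term starts and ends with a word character.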
