Unicode support in regular expressions

Question

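I am trying to highlight (bold) the words in a string that match a search term. The following function works for English but does not support Unicode. I tried adding the u modifier to the regex for Unicode support, but it did not work for me.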

function highlight_term($text, $words)
{
    preg_match_all('~[A-Za-z0-9_äöüÄÖÜ]+~u', $words, $m);
    if( !$m )
    {
        return $text;
    }

    $re = '~(' . implode('|', $m[0]) . ')~i';
    return preg_replace($re, '<b>$0</b>', $text);
}

$str = "ह ट इ ड यन भ भ और द";

echo highlight_term($str, 'और');

Output

Expected output

ह ट इ ड यन भ भ <b>और</b> द

Answer 1

Fix your current approach

Note that you can change the first regex to ~[\p{L}\p{M}]+~u to match all Unicode letters (with the u modifier, \p{L} becomes Unicode-aware and matches any Unicode letter) and diacritics (\p{M} matches combining marks), and add the u modifier to the pattern passed to the second call, preg_replace:

function highlight_term($text, $words)
{
    $i = preg_match_all('~[\p{L}\p{M}]+~u', $words, $m);
    if( $i == 0 )
    {
        return $text;
    }

    $re = '~' . implode('|', $m[0]) . '~iu';
    return preg_replace($re, '<b>$0</b>', $text);
}

$str = "ह ट इ ड यन भ भ और द";

echo highlight_term($str, 'और');

Result: ह ट इ ड यन भ भ <b>और</b> द.

See the PHP demo.
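To see why \p{M} belongs in the character class, here is a small sketch that compares the two classes on a word containing combining marks. The sample word and the variable names are illustrative assumptions, not part of the original answer, and the \u{...} escapes assume PHP 7.0 or later.

```php
<?php
// The Hindi word "हिंदी", written with escapes so each code point is visible:
// U+0939 letter, U+093F vowel sign (Mc), U+0902 anusvara (Mn),
// U+0926 letter, U+0940 vowel sign (Mc).
$word = "\u{0939}\u{093F}\u{0902}\u{0926}\u{0940}";

// Letters only: every combining mark ends the current match.
preg_match_all('~\p{L}+~u', $word, $lettersOnly);

// Letters plus combining marks: the word is matched as one unit.
preg_match_all('~[\p{L}\p{M}]+~u', $word, $withMarks);

var_dump($lettersOnly[0]);
var_dump($withMarks[0]);
```

With \p{L}+ alone the word is split into two single-letter fragments, while [\p{L}\p{M}]+ returns it whole, which is what the highlighting function needs when it builds the alternation.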

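The u modifier is needed in the second regex because both the pattern (built from the search words) and the text it is applied to are Unicode strings; without it, the regex engine treats them as sequences of bytes rather than characters. The outer parentheses in the second regex are not needed, since you are only interested in the whole match value (the replacement uses the $0 backreference).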
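A quick way to observe the effect of the u modifier is case-insensitive matching on a non-ASCII letter. This is only an illustrative sketch; the variable names are assumptions, and the source file is assumed to be saved as UTF-8.

```php
<?php
// 'ä' and 'Ä' are two-byte sequences in UTF-8, so byte-wise case folding
// cannot pair them up.
$withoutU = preg_match('~ä~i', 'Ä');   // pattern and subject handled as raw bytes
$withU    = preg_match('~ä~iu', 'Ä');  // pattern and subject handled as code points

var_dump($withoutU, $withU);
```

Only the second call reports a match, which is why the i modifier in the highlighting pattern should be paired with u.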

A better way

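You can pass an array of words to the highlighting function and match only whole words by using word boundaries, passing the regex directly to the preg_replace function.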
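The following is a minimal sketch of that whole-word approach, not the answerer's original code. The function name highlight_terms, the empty-array guard, and the preg_quote call (which escapes regex metacharacters in the search terms) are assumptions added for illustration.

```php
<?php
// Hypothetical helper: wraps each whole-word occurrence of any term in <b>...</b>.
function highlight_terms($text, array $words)
{
    if (!$words) {
        return $text; // nothing to highlight
    }

    // Escape regex metacharacters and the ~ delimiter in every term.
    $alternatives = array_map(function ($w) {
        return preg_quote($w, '~');
    }, $words);

    // With the u modifier, PHP also makes \b Unicode-aware.
    $re = '~\b(?:' . implode('|', $alternatives) . ')\b~iu';

    // preg_replace returns null on failure; fall back to the unchanged text.
    $result = preg_replace($re, '<b>$0</b>', $text);
    return $result === null ? $text : $result;
}

$str = "ह ट इ ड यन भ भ और द";
echo highlight_terms($str, ['और', 'भ']), PHP_EOL;
```

One caveat worth checking in your environment: depending on the PCRE version, \b may not treat combining marks as word characters, so terms that begin or end with a mark might need explicit lookarounds built on [\p{L}\p{M}] instead.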

See this PHP demo.
