My current problem is that I have a table full of words and their abbreviations, which I load into two arrays so I can use them later with preg_replace.
$search[] = '/\b'.$row['word'].'\b/i'; // words to abbreviate
$abbrev[] = $row['abbrev']; // list of abbreviations
// search and replace
for ($i = 0; $i < count($search); $i++)
{
    $title = preg_replace($search[$i], $abbrev[$i], $title);
}
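Everything seems to work, but some conversions come out wrong:
Heaven's Basement -> Heaven'S. Basement
S. is the abbreviation for South.
How can I make sure that a word or character that follows a symbol or punctuation mark is not replaced? Any help would be greatly appreciated, as my understanding of regular expressions is limited.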
Answer 0 (score: 1)
Instead of \b you can use assertions and check for whitespace (\s) or the start of the string (^):
'/(?<=\s|^)' . $row['word'] . '\b/i'
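Now the word must be preceded by whitespace (or by the start of the string), rather than by any "non-word" character.
As a side note, you don't need the loop; preg_replace also accepts arrays: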
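To see the difference concretely, here is a minimal sketch comparing the two patterns. The sample word, abbreviation, and title are assumptions chosen to reproduce the question's symptom; they are not taken from the asker's table.

```php
<?php
// Hypothetical sample data: abbreviate the standalone word "s" to "S."
$word   = 's';
$abbrev = 'S.';
$title  = "Heaven's Basement";

// \b also matches between the apostrophe and the "s", so the possessive is replaced.
$withBoundary = preg_replace('/\b' . preg_quote($word, '/') . '\b/i', $abbrev, $title);

// The lookbehind requires whitespace or the start of the string before the word.
$withLookbehind = preg_replace('/(?<=\s|^)' . preg_quote($word, '/') . '\b/i', $abbrev, $title);

echo $withBoundary, "\n";   // Heaven'S. Basement
echo $withLookbehind, "\n"; // Heaven's Basement
```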
$title = preg_replace($search, $abbrev, $title);
Update: I made a mistake in the assertion syntax. It works now: running example
Test code:
$rows = [
    ['word' => 'S', 'abbrev' => 'South'],
    ['word' => 'W', 'abbrev' => 'West'],
    ['word' => 'N', 'abbrev' => 'North'],
    ['word' => 'E', 'abbrev' => 'East'],
];
foreach ($rows as $row) {
    $search[] = '/(?<=\s|^)' . preg_quote($row['word'], '/') . '\b/i';
    $abbrev[] = $row['abbrev'];
}
$title = "Heaven's Basement in W Virginia";
echo preg_replace($search, $abbrev, $title);
Test result:
Heaven's Basement in West Virginia
Update 2: You can do the same for what follows the word, using a lookahead assertion and $ (end of string) instead of ^ (start of string):
'/(?<=\s|^)' . preg_quote($row['word'], '/') . '(?=\s|$)/i';
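As a rough illustration of what the lookahead changes, the sketch here runs the same title through both variants. The rows and the title are made-up sample data, so treat the outputs as a demonstration under those assumptions rather than a general guarantee.

```php
<?php
// Hypothetical rows and title, for illustration only.
$rows = [
    ['word' => 'S', 'abbrev' => 'South'],
    ['word' => 'N', 'abbrev' => 'North'],
];
$title = "S College Rd N. entrance";

$strict = $loose = $abbrev = [];
foreach ($rows as $row) {
    $quoted   = preg_quote($row['word'], '/');
    // Whitespace (or a string edge) is required on both sides of the word.
    $strict[] = '/(?<=\s|^)' . $quoted . '(?=\s|$)/i';
    // Only the left side is checked; \b on the right still accepts punctuation.
    $loose[]  = '/(?<=\s|^)' . $quoted . '\b/i';
    $abbrev[] = $row['abbrev'];
}

$strictResult = preg_replace($strict, $abbrev, $title);
$looseResult  = preg_replace($loose, $abbrev, $title);

echo $strictResult, "\n"; // South College Rd N. entrance
echo $looseResult, "\n";  // South College Rd North. entrance
```

With the lookahead in place, "N." is left alone because the character after "N" is a period rather than whitespace.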
Answer 1 (score: 0)
You can use a negative lookbehind, such as '/(?<![\'])\b'.$row['word'].'\b/i'.
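A small sketch of how this pattern behaves, assuming a made-up word/abbreviation pair and title: the possessive after the apostrophe is skipped, but a word that follows other punctuation (here an opening parenthesis) is still replaced, which a whitespace-only lookbehind would not do.

```php
<?php
// Hypothetical sample data, for illustration only.
$word   = 'S';
$abbrev = 'South';
$title  = "Heaven's Basement (S entrance)";

// Reject a match only when the preceding character is an apostrophe.
$pattern = '/(?<![\'])\b' . preg_quote($word, '/') . '\b/i';
$result  = preg_replace($pattern, $abbrev, $title);

echo $result, "\n"; // Heaven's Basement (South entrance)
```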
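Answer 2 (score: 0)
You may be better off tokenizing the input and looking up replacements in a callback, rather than running a separate search for every word you want to replace.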
function abbreviate($str, $abbr) {
    // Define a "word" as a UTF-8 string beginning with a word character (\w)
    // followed by as many word characters or apostrophes as possible.
    // You will probably have to tweak this.
    $re = '/\w[\w\']*/u';
    $callback = function($matches) use ($abbr) {
        $replacement = $matches[0];
        // Keys in $abbr are expected to be lowercase.
        $word = mb_strtolower($matches[0], 'utf-8');
        if (isset($abbr[$word])) {
            $replacement = $abbr[$word];
        }
        return $replacement;
    };
    return preg_replace_callback($re, $callback, $str);
}
echo abbreviate("Heaven's Basement\n", array('s'=>'S.'));
echo abbreviate("S College Rd\n", array('s' => 'South', 'rd'=> 'Road'));
Prints:
Heaven's Basement
South College Road
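Since abbreviate() looks words up by their lowercased form, rows shaped like the question's table would need to be turned into a lowercase-keyed map first. One possible way to do that is sketched here; the rows are hypothetical, and strtolower() is used on the assumption that the words are plain ASCII (mb_strtolower() would be the counterpart for non-ASCII words).

```php
<?php
// Hypothetical rows, shaped like the question's table.
$rows = [
    ['word' => 'South', 'abbrev' => 'S.'],
    ['word' => 'Road',  'abbrev' => 'Rd.'],
];

// Build a lookup keyed by the lowercased word.
// Assumption: ASCII words only; use mb_strtolower() for non-ASCII input.
$abbr = [];
foreach ($rows as $row) {
    $abbr[strtolower($row['word'])] = $row['abbrev'];
}

var_export($abbr);
```

The resulting $abbr array can then be passed as the second argument to abbreviate().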