Question

I have the following code:

//I am the !test! stuff1 all!! string
//I am the !test! all!! string

Output:

I am escaping the items with preg_quote, so why does this happen?

I ran into a second problem, which is solved now, but I don't know why it happened. Suppose $clid = "I am Jörg Müller with special chars". If I remove the line $patterns = array_filter($patterns);, the output after preg_replace is I am J. I could not find the cause, but array_filter fixed it.

Thanks

Answer 1

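The problem is that you are using \b to assert word boundaries. However, "!" is not a word character, so \b does not match in " !" (between a space and "!").

These are the word boundaries in $clid: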

 I   a m   t h e   ! t e s t !   s t u f f 1   a l l ! !   s t r i n g
^ ^ ^   ^ ^     ^   ^       ^   ^           ^ ^     ^     ^           ^
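As a small illustration of the diagram above, the following sketch checks \b around one item that starts and ends with word characters and one that starts and ends with "!". The variable names $plain and $bang are only for this example.

```php
<?php
$clid = "I am the !test! stuff1 all!! string";

// "stuff1" begins and ends with word characters, so there is a
// word boundary on both sides of it.
$plain = preg_match('/\bstuff1\b/', $clid);

// "!test!" begins and ends with "!", a non-word character. Between " "
// and "!" both neighbours are non-word characters, so \b cannot match.
$bang = preg_match('/\b' . preg_quote('!test!', '/') . '\b/', $clid);

var_dump($plain); // int(1)
var_dump($bang);  // int(0)
```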

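You can use lookarounds to assert what surrounds each item:

(?:-[- ]?| +) matches "- ", "-", "--", or one or more spaces.
(?:-[- ]?|(?= )|$) matches "- ", "-", "--", or asserts that the item is followed by a space or the end of the string.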

Regex

$pattern = '/(?:-[- ]?| +)(?:'.implode('|', $patterns).')(?:-[- ]?|(?= )|$)/i';

Code

//Array filled with data from external file
$patterns = array('!test!', 'stuff1', 'all!!', '');

//Delete empty values in array
$patterns = array_filter($patterns);

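foreach ($patterns as &$item) {
    // Pass the delimiter so that a "/" inside an item is escaped as well
    $item = preg_quote($item, '/');
}
unset($item); // break the reference left over from the foreach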

$pattern = '/(?:-[- ]?| +)(?:'.implode('|', $patterns).')(?:-[- ]?|(?= )|$)/i';


$clid = "I am the !test! stuff1 all!! string and !test!! not matched";
$clid = trim(preg_replace($pattern, '', $clid));

echo $clid;

Output

I am the string and !test!! not matched

ideone demo

As for your second question, there is an empty item in your array, so the regex becomes:

(?:option1|option2|option3|)
                           ^

Notice that there is a fourth option: the empty subpattern. An empty subpattern always matches, so your regex can be read as:

/(\b|^|- |--|-)(-|--| -|\b|$)/i

That is why you got unexpected results.

array_filter() solves your problem by removing the empty items.
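To see the effect of the empty alternative in isolation, here is a minimal sketch. It uses a simplified pattern (a single leading space before the alternation), not the original regex, and the variable names are only for illustration.

```php
<?php
$clid  = "I am the string";
$items = array('!test!', 'stuff1', 'all!!', '');

// Escape every item for use inside a /.../ pattern.
$quote = function ($p) { return preg_quote($p, '/'); };

// With the empty item kept, the alternation ends in "|)", an empty branch
// that matches at every position, so every single space is removed.
$withEmpty = '/ (?:' . implode('|', array_map($quote, $items)) . ')/';
$a = preg_replace($withEmpty, '', $clid);

// With the empty item filtered out, only real items can match.
$filtered     = array_filter($items);
$withoutEmpty = '/ (?:' . implode('|', array_map($quote, $filtered)) . ')/';
$b = preg_replace($withoutEmpty, '', $clid);

var_dump($a); // string(12) "Iamthestring"
var_dump($b); // string(15) "I am the string"
```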

Answer 2

The way I would do it:

$clid = "I am the !test! stuff1 all!! string";

$items = ['!test!', 'stuff1', 'all!!', ''];

$pattern = array_reduce($items, function ($c, $i) {
    return empty($i) ? $c : $c . preg_quote($i, '~') . '|';
}, '~[- ]+(?:');

$pattern .= '(*F))(?=[- ])~u';

$result = preg_replace($pattern, '', ' ' . $clid . ' ');
$result = trim($result, "- \t\n\r\0\x0b");

demo

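The idea is to check for a space or a hyphen after the matched item with a lookahead. This way, this "separator" is not consumed and the pattern can handle consecutive matches.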
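The difference between consuming the separator and only looking ahead at it can be sketched as follows. The subject string and the two reduced patterns are assumptions made for this example; they are not taken from the code above.

```php
<?php
$subject = ' a stuff1 all!! b ';

// The trailing separator is consumed: after " stuff1 " is removed, the
// space that "all!!" needs in front of it is gone, so "all!!" survives.
$consuming = '~[- ]+(?:stuff1|all!!)[- ]~';
$r1 = preg_replace($consuming, '', $subject);

// The trailing separator is only asserted: it stays in the subject and
// can serve as the leading separator of the next match.
$lookahead = '~[- ]+(?:stuff1|all!!)(?=[- ])~';
$r2 = preg_replace($lookahead, '', $subject);

var_dump($r1); // string(10) " aall!! b "
var_dump($r2); // string(5) " a b "
```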

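To avoid an alternation at the start of the pattern (such as (?:[- ]|^)[- ]*, which is slow), I add a space at the beginning of the source string and remove it with trim after the replacement.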

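(*F) (which forces the pattern to fail) is only there because the alternation of items is built with array_reduce, which leaves a trailing | at the end.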
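As a rough check of that point, the sketch below builds the alternation twice: once with implode over the non-empty items (no trailing |), and once with a trailing | closed by (*F). Both patterns should behave the same; the variable names are only for this example.

```php
<?php
$items = ['!test!', 'stuff1', 'all!!', ''];

// Quote the non-empty items for a ~...~ pattern.
$quoted = [];
foreach ($items as $i) {
    if ($i !== '') {
        $quoted[] = preg_quote($i, '~');
    }
}

// No trailing "|": the alternation holds only real items.
$viaImplode = '~[- ]+(?:' . implode('|', $quoted) . ')(?=[- ])~u';

// Trailing "|" followed by (*F): the last branch can never succeed,
// so it does not behave like an empty branch.
$viaFail = '~[- ]+(?:' . implode('|', $quoted) . '|(*F))(?=[- ])~u';

$subject = ' I am the !test! stuff1 all!! string ';
$r1 = trim(preg_replace($viaImplode, '', $subject));
$r2 = trim(preg_replace($viaFail, '', $subject));

var_dump($r1); // string(15) "I am the string"
var_dump($r1 === $r2); // bool(true)
```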

The u modifier solves the problem with characters outside the ASCII range. With this modifier, the regex engine is able to handle UTF-8 encoded strings.
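One way to observe what the u modifier changes is to count how many times . matches in a string that contains a non-ASCII character. This sketch assumes PHP 7 or later for the "\u{...}" escape, which produces the UTF-8 bytes of "ö".

```php
<?php
// "Jörg" with "ö" written as an explicit code point (2 bytes in UTF-8).
$name = "J\u{00F6}rg";

// Without u, the subject is treated as a sequence of bytes,
// so "." matches each of the two bytes of "ö" separately.
$bytes = preg_match_all('/./', $name);

// With u, the subject is treated as UTF-8 and "." matches whole characters.
$chars = preg_match_all('/./u', $name);

var_dump($bytes); // int(5)
var_dump($chars); // int(4)
```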

preg_replace does not work as expected with wildcards
