Question

I have this code:

$text = "This is a $1ut ( Y ) @ss @sshole a$$ ass test with grass and passages.";
$blacklist = array(
  '$1ut',
  '( Y )',
  '@ss',
  '@sshole',
  'a$$',
  'ass'
);
foreach ($blacklist as $word) {
  $pattern = "/\b". preg_quote($word) ."\b/i";
  $replace = str_repeat('*', strlen($word));
  $text = preg_replace($pattern, $replace, $text);
}
print_r($text);

which returns the following result:

This is a $1ut ( Y ) @ss @sshole a$$ *** test with grass and passages.

When I remove the word boundaries from the regexp,

$pattern = "/". preg_quote($word) ."/i";

it returns:

This is a **** ***** *** ***hole *** *** test with gr*** and p***ages.

How can I write the regular expression so that it does not replace parts of words such as passages and grass, but does replace @sshole in full?

Answer 1

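According to this, \b does not treat anything other than [A-Za-z0-9_] as a word character, so it cannot find a boundary next to symbols such as @, $ or (.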
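A small sketch of that behaviour may help. The sample strings and variable names here are made up for illustration; only preg_match() and \b come from PHP itself.

```php
<?php
// \b matches only between a word character [A-Za-z0-9_] and a non-word character.

// "ass" is surrounded by spaces: word/non-word transitions on both sides.
$plain = preg_match('/\bass\b/', 'an ass here');

// " @ss": the space and "@" are both non-word characters, so there is
// no boundary in front of "@" and the pattern cannot match.
$symbol = preg_match('/\b@ss\b/', 'an @ss here');

// "x@ss": now a word character precedes "@", so a boundary exists there.
$glued = preg_match('/\b@ss\b/', 'an x@ss here');

var_dump($plain, $symbol, $glued); // int(1) int(0) int(1)
```

This is why only the plain word ass was replaced in the question's first attempt.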

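Note that you have to escape the regex, because you are building it from a string (and the PHP compiler does not know it is a regular expression when it creates that string). That is why \s is written as \\s inside the double-quoted strings below.

Using the regular expression /(^|\s)WORD($|\s)/i seems to work.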

Code example:

$text = "This is a $1ut ( Y ) @ss @sshole a$$ ass test with grass and passages.";
$blacklist = array(
  '$1ut',
  '( Y )',
  '@ss',
  '@sshole',
  'a$$',
  'ass'
);
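foreach ($blacklist as $word) {
  // Match the word only at the start/end of the text or between whitespace.
  // Passing '/' to preg_quote() also escapes the pattern delimiter.
  $pattern = "/(^|\\s)" . preg_quote($word, '/') . "($|\\s)/i";
  // The surrounding whitespace is part of the match, so put a space back on each side.
  $replace = " " . str_repeat('*', strlen($word)) . " ";
  $text = preg_replace($pattern, $replace, $text);
}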
echo $text;

Output:

This is a **** ***** *** ******* *** *** test with grass and passages.

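Note that if your string starts or ends with one of the words, a space is still added on each side of the match, which means there will be an extra space before or after the text. You can handle this with trim().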
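As a minimal sketch of the trim() step, assuming a made-up input that both starts and ends with a blacklisted word:

```php
<?php
// Hypothetical input: the blacklisted word is the first and the last token.
$text = 'ass is first and last ass';
$blacklist = array('ass');

foreach ($blacklist as $word) {
    $pattern = '/(^|\s)' . preg_quote($word, '/') . '($|\s)/i';
    $replace = ' ' . str_repeat('*', strlen($word)) . ' ';
    $text = preg_replace($pattern, $replace, $text);
}

// At this point $text has a stray space at both ends; trim() removes them.
$text = trim($text);

echo $text; // *** is first and last ***
```

Keep in mind that trim() also strips any leading or trailing whitespace that was in the text originally.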

Update

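Also note that this does not account for punctuation in any way.

For example, "the other user has an ass. and it is nice" would pass through unfiltered.

To overcome this, you could extend it further: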

/(^|\s|!|,|\.|;|:|\-|_|\?)WORD($|\s|!|,|\.|;|:|\-|_|\?)/i

This means you also have to change how the replacement is built:

$text = "This is a $1ut ( Y ) @ss?@sshole you're an ass. a$$ ass test with grass and passages.";
$blacklist = array(
  '$1ut',
  '( Y )',
  '@ss',
  '@sshole',
  'a$$',
  'ass'
);
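foreach ($blacklist as $word) {
  // Allow whitespace or common punctuation on either side of the word.
  $pattern = "/(^|\\s|!|,|\\.|;|:|\\-|_|\\?)" . preg_quote($word, '/') . "($|\\s|!|,|\\.|;|:|\\-|_|\\?)/i";
  // $1 and $2 put back whatever separator was captured on each side.
  $replace = '$1' . str_repeat('*', strlen($word)) . '$2';
  $text = preg_replace($pattern, $replace, $text);
}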
echo $text;

And so on, adding any other punctuation marks you need.

Output:

This is a **** ***** ***?******* you're an ***. *** *** test with grass and passages.
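A possible alternative, not part of the answer above, is to use lookarounds so that the surrounding characters are checked but never consumed. The sketch below assumes a blacklisted word should be masked whenever it is not directly attached to a character from [A-Za-z0-9_]; the function name censorWords is hypothetical.

```php
<?php
// Hypothetical helper: mask each blacklisted word with asterisks.
function censorWords(string $text, array $blacklist): string
{
    foreach ($blacklist as $word) {
        // (?<!...) and (?!...) assert that no word character touches the match.
        // They match no characters themselves, so nothing needs to be put back.
        $pattern = '/(?<![A-Za-z0-9_])' . preg_quote($word, '/') . '(?![A-Za-z0-9_])/i';
        $text = preg_replace($pattern, str_repeat('*', strlen($word)), $text);
    }
    return $text;
}

// Single quotes avoid any variable interpolation of "$" in the sample text.
$input = 'This is a $1ut ( Y ) @ss?@sshole you\'re an ass. a$$ ass test with grass and passages.';
$blacklist = array('$1ut', '( Y )', '@ss', '@sshole', 'a$$', 'ass');

$censored = censorWords($input, $blacklist);
echo $censored;
// This is a **** ***** ***?******* you're an ***. *** *** test with grass and passages.
```

Because the separators are not part of the match, this variant needs neither a punctuation list nor trim(). The trade-off is that any non-word character counts as a separator, so a word glued to a symbol (for example @ass) would still have its ass part masked.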
