Question

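I'm trying to get a bad-word filter working. So far, with the code below, if I enter a bad word such as "bad1" that is listed in the array below, no filtering happens and I get this error:

Warning: preg_match() [function.preg-match]: Unknown modifier '/'

Here is the code: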

if (isset($_POST['text'])) {

// Words not allowed
$disallowedWords = array(
'bad1',
'bad2',
);
// Search for disallowed words.
// The Regex used here should e.g. match 'are', but not match 'care'
foreach ($disallowedWords as $word) {
if (preg_match("/\s+$word\s+/i", $entry)) {
die("The word '$word' is not allowed...");
}
}

// Variable contains a regex that will match URLs

$urlRegex = '/(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-
9\.&amp;%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]
{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1
-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)
\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost
|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.
(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-z
A-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&amp;%\$#\=~_\-]+))*/';

// Search for URLs
if (preg_match($urlRegex, $entry)) {
die("URLs are not allowed...");
}

}

Answer 1

This is the correct way to match whole words. Use this regex inside the foreach loop.

preg_match("#\b" . $word . "\b#", $entry);

You can also try the regex in an online regex tester, using /\bbad1\b/g.

The code in action:

<?php
// delete the line below in your code
$entry = "notbad1word bad1 bad notbad1.";

$disallowedWords = array(
    'bad1',
    'bad2',
);

foreach ($disallowedWords as $word)
{ // use $_POST['text'] instead of $entry
    preg_match("#\b". $word ."\b#", $entry, $matches); 
    if(!empty($matches))
        die("The word " . $word . " is not allowed.");
}

echo "All good.";

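This code does not match notbad1word or notbad2word (and so on); it matches only bad1 or bad2 as whole words.

As for your $urlRegex, you have to escape every / inside the pattern with a backslash, like this: \/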

Answer 2

You can do this without a slow loop:

<?php

$_POST['text'] = 'This sentence uses the nobad1 bad2 word!';

if (isset($_POST['text'])) {

    // Words not allowed
    $disallowedWords = array(
        'bad1',
        'bad2',
    );

    $pattern = sprintf('/(\\s%s\\s)/i', implode('\\s|\\s',$disallowedWords));
    $subject = ' '.$_POST['text'].' ';
    if (preg_match($pattern, $subject, $token)) {
        die(sprintf("The word '%s' is not allowed...\n", trim($token[1])));
    }
}

You have to make sure the word list does not contain any of the characters /, ( or ).
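If the word list cannot be guaranteed free of those characters, one option is to pass each word through preg_quote() before building the pattern. The following is a minimal sketch under that assumption; the word list and the sample text are made up for illustration and are not from the original post.

```php
<?php
// Hypothetical word list containing the characters the answer warns about.
$disallowedWords = array('a/b', 'f(x)');

// preg_quote() escapes regex metacharacters; passing '/' as the second
// argument also escapes the delimiter used in the pattern below.
$quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $disallowedWords);

// Same whitespace-delimited pattern as in the answer, built from the quoted words.
$pattern = sprintf('/(\\s%s\\s)/i', implode('\\s|\\s', $quoted));
$subject = ' ' . 'call f(x) here' . ' ';

$found = preg_match($pattern, $subject, $token);
echo $found, ' ', trim($token[1]), "\n";
```

With quoting in place, the parentheses and the slash are treated as literal characters rather than as a group or the end of the pattern.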

Answer 3

You are using / as the delimiter character, but you don't escape its "inner" occurrences:

$urlRegex = '/(http|https|ftp)\://whatever/';
//                               ^ Unknown modifier ‘/’

Either change the delimiter or escape the slashes.
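Both fixes can be sketched as follows. The pattern here is a deliberately shortened stand-in for the question's $urlRegex, and the sample text is made up; only the delimiter handling is the point.

```php
<?php
// Illustrative input; not from the original post.
$entry = 'see http://example.com/page for details';

// Option 1: keep "/" as the delimiter and escape every literal slash.
$escaped = '/(http|https|ftp):\/\/[a-z0-9.\-]+/i';

// Option 2: switch to a delimiter that does not occur in the pattern.
$hashDelimited = '#(http|https|ftp)://[a-z0-9.\-]+#i';

$hitEscaped = preg_match($escaped, $entry, $m1);
$hitHash    = preg_match($hashDelimited, $entry, $m2);

var_dump($hitEscaped, $hitHash);
echo $m1[0], "\n";
```

Switching the delimiter tends to be easier to read for URL patterns, since they contain many slashes.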

Regarding your "bad words" filter:

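Words at the beginning and end of the string are not recognized. Consider using \b (word boundary) instead of \s+.
If any bad word in the array contains unescaped regex metacharacters, the results may be unexpected. Consider applying preg_quote to each word in the array.
Making n preg_match calls for n words is inefficient. I suggest collapsing the word array into a single regex such as '/\b(word1|word2|word3)\b/i'.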
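The three suggestions above can be combined roughly as follows. This is a sketch, not a definitive implementation: the helper name findDisallowedWord is hypothetical, the nullable return type assumes PHP 7.1 or later, and the word list is assumed to be non-empty and made of words that begin and end with a letter, digit or underscore (otherwise \b will not behave as intended).

```php
<?php
// Hypothetical helper; returns the first disallowed word found, or null.
function findDisallowedWord(string $text, array $words): ?string
{
    // Quote each word so regex metacharacters and the "/" delimiter are literal.
    $quoted = array_map(
        function ($w) { return preg_quote($w, '/'); },
        $words
    );

    // One alternation wrapped in word boundaries, case-insensitive.
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    return preg_match($pattern, $text, $m) === 1 ? $m[1] : null;
}

$words = array('bad1', 'bad2');

var_dump(findDisallowedWord('bad1 at the start', $words));   // word at the start of the string
var_dump(findDisallowedWord('ends with BAD2', $words));      // word at the end, different case
var_dump(findDisallowedWord('notbad1word is fine', $words)); // embedded in a longer word
```

The first two calls show that \b, unlike \s+, also matches at the start and end of the string; the third shows that a bad word embedded in a longer word is not reported.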

Bad-word regex filter not working

3 answers: