Question

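I want to make sure user input does not contain characters such as <, > or &#, whether it comes from a text input or a textarea. My pattern: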

var pattern = /^((?!&#|<|>).)*$/m;

The problem is that it still matches multi-line strings from a textarea, such as

this text matches

though this should not, because of this character <

Edit:

To be clear, I only need to exclude the &# combination, not & or # on their own.

Please suggest a solution. Many thanks.

Answer 1

In this case I don't think you need a lookaround assertion. Just use a negated character class:

var pattern = /^[^<>&#]*$/; // no m flag, so ^ and $ anchor the whole input rather than a single line

If you also want to disallow the characters -, [ and ], be sure to escape them or place them where they are read literally:

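var pattern = /^[^\][<>&#-]*$/; // "]" is escaped because JavaScript parses [^] as "any character"; no m flag, so ^ and $ span the whole input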
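
As a rough illustration of how a negated class behaves here, the sketch below runs it against a few made-up inputs (the variable names and sample strings are hypothetical, not from the question). It assumes the pattern is used without the m flag, and it shows two things worth knowing: a negated class rejects a lone & or #, which is stricter than the question's edit asks for, and in JavaScript a literal ] inside a class has to be escaped.

```javascript
// Hypothetical sample inputs, for illustration only.
var clean = "first line\nsecond line";
var dirty = "first line\nsecond < line";

// Negated class anchored to the whole string (no m flag).
// A negated class also matches newlines, so multi-line input is covered.
var basic = /^[^<>&#]*$/;
console.log(basic.test(clean));          // true
console.log(basic.test(dirty));          // false
console.log(basic.test("fish & chips")); // false: a lone & is rejected as well

// In JavaScript, [^] is a complete class meaning "any character",
// so ] must be escaped; a trailing - is read literally.
var extended = /^[^\][<>&#-]*$/;
console.log(extended.test("plain text")); // true
console.log(extended.test("a]b"));        // false
console.log(extended.test("a-b"));        // false
```

If a lone & or # must stay legal, a class alone cannot express the two-character sequence &#, and a lookahead or an inverted test is needed instead.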

Answer 2

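You are probably not looking for the m (multiline) flag in JavaScript but for the s (DOTALL) flag. Unfortunately, JavaScript had no s flag when this was written (ES2018 added one later).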

The good news, however, is that DOTALL can be emulated with [\s\S]. Try one of the following regexes:

/^(?![\s\S]*?(&#|<|>))[\s\S]*$/

OR：

/^((?!&#|<|>)[\s\S])*$/
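
To make the difference concrete, here is a small sketch that runs the question's original pattern and the two patterns above against the question's sample text. The variable names and the second sample string are assumptions added for illustration. With the m flag, ^ and $ match at line boundaries, so a single clean line is enough for the original pattern to succeed; the [\s\S] versions have no m flag and must consume the entire input.

```javascript
// The sample from the question: the first line is clean, the last contains "<".
var sample = "this text matches\n\nthough this should not, because of this character <";

// Original pattern: with m, the clean first line alone satisfies ^...$.
var original = /^((?!&#|<|>).)*$/m;
console.log(original.test(sample)); // true (the unwanted result)

// The two suggested patterns: anchored to the whole string.
var lookaheadOnce = /^(?![\s\S]*?(&#|<|>))[\s\S]*$/;
var lookaheadEach = /^((?!&#|<|>)[\s\S])*$/;
console.log(lookaheadOnce.test(sample)); // false
console.log(lookaheadEach.test(sample)); // false

// Hypothetical multi-line input with a lone & and a lone #, which stay allowed.
var allowed = "line one\nfish & chips, item #3";
console.log(lookaheadOnce.test(allowed)); // true
console.log(lookaheadEach.test(allowed)); // true
```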


Answer 3

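Alternative answer to the specific question:

anubhava's solution works correctly, but it is slow because it must perform a negative lookahead at every character position in the string. A simpler approach is to use reverse logic: instead of verifying that /^((?!&#|<|>)[\s\S])*$/ matches, verify that /[<>]|&#/ does NOT match. To illustrate, let's create a function, hasSpecial(), that tests whether a string contains one of the special characters. Here are two versions, the first using anubhava's second regex: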

function hasSpecial_1(text) {
    // If regex matches, then string does NOT contain special chars.
    return !/^((?!&#|<|>)[\s\S])*$/.test(text);
}
function hasSpecial_2(text) {
    // If regex matches, then string contains (at least) one special char.
    return /[<>]|&#/.test(text);
}

The two functions are functionally equivalent, but the second is probably much faster.
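
The equivalence claim can be spot-checked with a short sketch like the one below. The sample strings are made up for illustration, and agreement on a handful of inputs is only a sanity check, not a proof. The two functions are restated so the snippet runs on its own.

```javascript
// Restated from the answer so the snippet is self-contained.
function hasSpecial_1(text) {
    return !/^((?!&#|<|>)[\s\S])*$/.test(text);
}
function hasSpecial_2(text) {
    return /[<>]|&#/.test(text);
}

// Hypothetical inputs: clean text, lone & and #, multi-line text, and offenders.
var samples = ["", "plain text", "fish & chips", "item #3", "a\nb",
               "x &# y", "1 < 2", "a\n> b"];

// Both versions should give the same verdict for every sample.
var allAgree = samples.every(function (s) {
    return hasSpecial_1(s) === hasSpecial_2(s);
});
console.log(allAgree);                     // true
console.log(hasSpecial_2("x &# y"));       // true
console.log(hasSpecial_2("fish & chips")); // false
```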

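Note that when I first read this question, I misread it as really wanting to exclude HTML special characters (including HTML entities). If that is the case, the following solution does exactly that.

Test whether a string contains HTML special characters:

The OP appears to want to ensure that a string contains no special HTML characters, including <, >, and decimal and hex HTML entities such as &#160;, &#xA0;, etc. If so, the solution should probably also exclude the other (named) type of HTML entity, such as &amp;, &lt;, etc. The solution below excludes all three forms of HTML entity as well as the <> tag delimiters.

Here are two approaches. (Note that both allow the sequence &# when it is not part of a valid HTML entity.)

Test for FALSE using a positive regex: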

function hasHtmlSpecial_1(text) {
    /* Commented regex:
        # Match string having no special HTML chars.
        ^                  # Anchor to start of string.
        [^<>&]*            # Zero or more non-[<>&] (normal*).
        (?:                # Unroll the loop. ((special normal*)*)
          &                # Allow a & but only if
          (?!              # not an HTML entity (3 valid types).
            (?:            # One from 3 types of HTML entities.
              [a-z\d]+     # either a named entity,
            | \#\d+        # or a decimal entity,
            | \#x[a-f\d]+  # or a hex entity.
            )              # End group of HTML entity types.
            ;              # All entities end with ";".
          )                # End negative lookahead.
          [^<>&]*          # More (normal*).
        )*                 # End unroll the loop.
        $                  # Anchor to end of string.
    */
    var re = /^[^<>&]*(?:&(?!(?:[a-z\d]+|#\d+|#x[a-f\d]+);)[^<>&]*)*$/i;
    // If regex matches, then string does NOT contain HTML special chars.
    return !re.test(text);
}

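Note that the regex above uses Jeffrey Friedl's "Unrolling-the-Loop" efficiency technique and runs very quickly for both matching and non-matching cases. (See his regex masterpiece: Mastering Regular Expressions (3rd Edition).)

Test for TRUE using a negative regex: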

function hasHtmlSpecial_2(text) {
    /* Commented regex:
        # Match string having one special HTML char.
          [<>]           # Either a tag delimiter
        | &              # or a & if start of
          (?:            # one of 3 types of HTML entities.
            [a-z\d]+     # either a named entity,
          | \#\d+        # or a decimal entity,
          | \#x[a-f\d]+  # or a hex entity.
          )              # End group of HTML entity types.
          ;              # All entities end with ";".
    */
    var re = /[<>]|&(?:[a-z\d]+|#\d+|#x[a-f\d]+);/i;
    // If regex matches, then string contains (at least) one special HTML char.
    return re.test(text);
}

Also note that I have included commented versions of these (non-trivial) regexes in the form of JavaScript comments.
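
As a usage sketch, the snippet below feeds a few made-up strings to both functions and checks that each returns the expected verdict. The test strings and the cases table are hypothetical; the two regexes are copied from the functions above, with the comments omitted for brevity.

```javascript
// Condensed restatements of the two functions, so the snippet runs on its own.
function hasHtmlSpecial_1(text) {
    return !/^[^<>&]*(?:&(?!(?:[a-z\d]+|#\d+|#x[a-f\d]+);)[^<>&]*)*$/i.test(text);
}
function hasHtmlSpecial_2(text) {
    return /[<>]|&(?:[a-z\d]+|#\d+|#x[a-f\d]+);/i.test(text);
}

// Hypothetical inputs paired with the verdict each function should return.
var cases = [
    ["fish & chips", false],       // lone & is allowed
    ["x &# y", false],             // &# that is not an entity is allowed
    ["line one\nline two", false], // multi-line input is fine
    ["&amp;", true],               // named entity
    ["&#160;", true],              // decimal entity
    ["&#xA0;", true],              // hex entity
    ["<b>bold</b>", true]          // tag delimiters
];

var allAsExpected = cases.every(function (c) {
    return hasHtmlSpecial_1(c[0]) === c[1] && hasHtmlSpecial_2(c[0]) === c[1];
});
console.log(allAsExpected); // true
```

One consequence of the named-entity branch is that any & followed by letters or digits and a semicolon is treated as an entity, whether or not HTML actually defines that name.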

Regex to exclude certain characters with multi-line matching
