Question

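Is there a way to achieve the equivalent of a negative lookbehind in JavaScript regular expressions? I need to match a string that is not preceded by a specific set of characters.

I can't seem to find a regex that does this without failing when the matched part sits at the very beginning of the string. Negative lookbehind seems to be the only answer, but JavaScript doesn't have it.

EDIT: This is the regex I would like to work, but it doesn't: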

(?<!([abcdefg]))m

So it would match the 'm' in 'jim' or 'm', but not in 'jam'.

Answer 1

Since JavaScript supports negative lookahead, one way to do it is:

reverse the input string
match it against a reversed regex
reverse the matches back and reformat them

const reverse = s => s.split('').reverse().join('');

const test = (stringToTests, reversedRegexp) => stringToTests
  .map(reverse)
  .forEach((s,i) => {
    const match = reversedRegexp.test(s);
    console.log(stringToTests[i], match, 'token:', match ? reverse(reversedRegexp.exec(s)[0]) : 'Ø');
  });

Example 1:

Following @andrew-ensley's question:

test(['jim', 'm', 'jam'], /m(?!([abcdefg]))/)

Output:

jim true token: m
m true token: m
jam false token: Ø

Example 2:

Following @neaumusic's comment (match max-height but not line-height, the token being height):

test(['max-height', 'line-height'], /thgieh(?!(-enil))/)

Output:

max-height true token: height
line-height false token: Ø

Answer 2

Let's suppose you want to find every int that is not preceded by unsigned:

With support for negative lookbehind:

(?<!unsigned )int

Without support for negative lookbehind:

((?!unsigned ).{9}|^.{0,8})int

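Basically, the idea is to grab the n preceding characters and exclude the unwanted match with a negative lookahead, while also matching the cases where fewer than n characters precede (where n is the length of the lookbehind; here "unsigned " is 9 characters, hence .{9} and ^.{0,8}).

So the regex in question: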

(?<!([abcdefg]))m

would translate to:

((?!([abcdefg])).|^)m

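You may need to play with capturing groups to find the exact position of the part of the string that interests you, or to replace only that part with something else.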
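A minimal sketch of how the capturing group can be used for replacement, assuming the goal is to rewrite only the int part. The names intNotAfterUnsigned, widen and mNotAfterAtoG are made up for this illustration:

```javascript
// Emulates (?<!unsigned )int. Group 1 holds the 9 characters (or the shorter
// start-of-string prefix) that precede "int".
const intNotAfterUnsigned = /((?!unsigned ).{9}|^.{0,8})int/g;

// Put the captured prefix back and replace only "int" itself.
const widen = (src) => src.replace(intNotAfterUnsigned, '$1long');

console.log(widen('int a; unsigned int b; static int c;'));
// long a; unsigned int b; static long c;

// The single-character version from the question (no g flag, so test() is stateless).
const mNotAfterAtoG = /((?![abcdefg]).|^)m/;
console.log(['jim', 'm', 'jam'].map((s) => mNotAfterAtoG.test(s)));
// [ true, true, false ]
```

Because the prefix is consumed as part of the match, two occurrences that sit closer together than n characters can interfere with each other in a global search, so this is not a full replacement for a real lookbehind.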

Answer 3

Mijoja's strategy works for your specific case, but not in general:

js>newString = "Fall ball bill balll llama".replace(/(ba)?ll/g,
   function($0,$1){ return $1?$0:"[match]";});
Fa[match] ball bi[match] balll [match]ama

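Here is an example where the goal is to match a double-l, but not if it is preceded by "ba". Note the word "balll": a true lookbehind should suppress the first two l's but match the second pair. By matching the first two l's and then discarding that match as a false positive, the regex engine resumes from the end of that match and ignores every character inside the false positive.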
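To make the difference visible, here is a small side-by-side sketch. It assumes an engine with native lookbehind support (ES2018), since the second regex literal will not parse otherwise; the variable names are only for illustration:

```javascript
const text = 'Fall ball bill balll llama';

// Workaround: the optional group swallows "ba", and the callback skips those hits.
const emulated = text.replace(/(ba)?ll/g, ($0, $1) => ($1 ? $0 : '[match]'));

// Native negative lookbehind (assumes an ES2018-capable engine).
const native = text.replace(/(?<!ba)ll/g, '[match]');

console.log(emulated); // Fa[match] ball bi[match] balll [match]ama
console.log(native);   // Fa[match] ball bi[match] bal[match] [match]ama
```

The two results differ only in "balll", which is exactly the overlap case described above.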

Answer 4

Use:

newString = string.replace(/([abcdefg])?m/, function($0,$1){ return $1?$0:'m';});
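As a usage sketch, the same callback pattern can be applied with a visible replacement. This assumes the intent is to act only on an m that has no a-g letter in front of it; upperM is a hypothetical helper name:

```javascript
// Group 1 optionally captures a forbidden prefix; when it is present the
// callback returns the whole match unchanged, otherwise it substitutes "M".
const upperM = (s) => s.replace(/([abcdefg])?m/g, ($0, $1) => ($1 ? $0 : 'M'));

console.log(upperM('jim')); // jiM
console.log(upperM('jam')); // jam
console.log(upperM('m'));   // M
```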

Answer 5

Lookbehind assertions were accepted into the ECMAScript specification in 2018. They have been implemented in V8 and shipped without flags in Google Chrome v62, and in Node.js v6 behind a flag and v9 without a flag. So if you are developing for a Chrome-only environment (such as Electron) or for Node, you can start using lookbehinds today!

Positive lookbehind usage:

console.log(
  "$9.99  €8.47".match(/(?<=\$)\d+(\.\d*)?/) // Matches "9.99"
);

Negative lookbehind usage:

console.log(
  "$9.99  €8.47".match(/(?<!\$)\d+(?:\.\d*)/) // Matches "8.47"
);

Support on other platforms:

Mozilla Firefox is working on it: tracked here.
Microsoft Edge is working on it too: tracked here (user voice suggestion).
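Since support differs between engines, one possible approach is to probe for lookbehind at runtime and fall back to a workaround. The following is only a sketch: supportsLookbehind and matchM are hypothetical names, and the fallback pattern consumes the preceding character, so it is not an exact equivalent.

```javascript
// Returns true when the running engine can compile a lookbehind assertion.
// The pattern is passed as a string so that an unsupported engine throws at
// runtime instead of failing to parse the whole script.
function supportsLookbehind() {
  try {
    new RegExp('(?<!a)b');
    return true;
  } catch (err) {
    return false;
  }
}

// Pick the native pattern when available, otherwise a consuming fallback.
const matchM = supportsLookbehind()
  ? new RegExp('(?<![a-g])m')
  : /(?:^|[^a-g])m/;

console.log(matchM.test('jim'), matchM.test('jam'), matchM.test('m'));
// true false true
```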

Answer 6

You can define a non-capturing group by negating your character set:

(?:[^a-g])m

...which would match every m NOT preceded by any of those letters.
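One caveat worth checking: the negated set needs an actual character in front of the m, so a lone m at the start of the string is not matched. A hedged sketch of that behaviour, with a possible ^ alternative added (strict and relaxed are illustrative names):

```javascript
const strict = /(?:[^a-g])m/;      // needs some character before the m
const relaxed = /(?:^|[^a-g])m/;   // also accepts an m at the very start

console.log(strict.test('jim'), strict.test('jam'), strict.test('m'));
// true false false
console.log(relaxed.test('jim'), relaxed.test('jam'), relaxed.test('m'));
// true false true

// The preceding character is part of the match, unlike with a real lookbehind.
console.log('jim'.match(relaxed)[0]); // im
```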

Answer 7

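Following Mijoja's idea, and drawing on the problems exposed by JasonS, I had this idea. I checked it a bit but I'm not sure of myself, so verification by someone more expert than me in JS regex would be great :)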

var re = /(?=(..|^.?)(ll))/g
         // matches empty string position
         // whenever this position is followed by
         // a string of length equal or inferior (in case of "^")
         // to "lookbehind" value
         // + actual value we would want to match

,   str = "Fall ball bill balll llama"

,   str_done = str
,   len_difference = 0
,   doer = function (where_in_str, to_replace)
    {
        str_done = str_done.slice(0, where_in_str + len_difference)
        +   "[match]"
        +   str_done.slice(where_in_str + len_difference + to_replace.length)

        len_difference = str_done.length - str.length
            /*  if str smaller:
                    len_difference will be positive
                else will be negative
            */

    }   /*  the actual function that would do whatever we want to do
            with the matches;
            this above is only an example from Jason's */



        /*  function input of .replace(),
            only there to test the value of $behind
            and if negative, call doer() with interesting parameters */
,   checker = function ($match, $behind, $after, $where, $str)
    {
        if ($behind !== "ba")
            doer
            (
                $where + $behind.length
            ,   $after
                /*  one will choose the interesting arguments
                    to give to the doer, it's only an example */
            )
        return $match // empty string anyhow, but well
    }
str.replace(re, checker)
console.log(str_done)

My own output:

Fa[match] ball bi[match] bal[match] [match]ama

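The principle is to call checker at every point in the string between any two characters, whenever that position is the starting point of:

--- any substring the size of what is not wanted (here 'ba', hence ..) (if that size is known; otherwise it is probably harder to do)

--- --- or shorter than that, if it is the beginning of the string: ^.?

and, following that,

--- what is actually sought (here 'll').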

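On each call of checker, there is a test to check that the value before ll is not what we don't want (!== 'ba'). If that is the case, we call another function, and it is this one (doer) that makes the changes to the string, if that is the purpose, or more generally, that receives the data needed to process the results of scanning str by hand.

Here we change the string, so we need to keep track of the difference in length in order to offset the positions given by replace, which are all calculated on str, which itself never changes.

Since primitive strings are immutable, we could have used the variable str to store the result of the whole operation, but I thought the example, already complicated by the replacements, would be clearer with another variable (str_done).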

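I guess that in terms of performance it must be pretty harsh: all those pointless replacements of '' with '', str.length-1 times, plus the manual replacement by doer here, which means a lot of slicing... In this specific case that could probably be grouped, by cutting the string only once into pieces around the places where we want to insert [match] and then .join()ing them with [match] itself.

The other thing is that I don't know how it would handle more complex cases, that is, complex values for the fake lookbehind... the length being perhaps the most problematic piece of data to get.

And, in checker, if there are several possible unwanted values for $behind, we will have to test it with yet another regex (best cached, that is created, outside checker, to avoid creating the same regex object on every call of checker) to know whether it is what we want to avoid.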

Hope I've been clear; if not, don't hesitate to ask and I'll try to do better. :)
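The performance remark above suggests cutting the string once and joining the pieces. A rough sketch of that idea follows. It is not the author's code: replaceUnlessPreceded and its parameters are hypothetical, and it assumes a global regex with two groups (behind, wanted) like the one used in this answer.

```javascript
// Collects the pieces around each accepted hit and joins them once, instead
// of re-slicing the result string on every hit.
function replaceUnlessPreceded(str, re, forbidden, replacement) {
  const pieces = [];
  let cursor = 0; // end of the last replaced span in str
  let m;
  re.lastIndex = 0;
  while ((m = re.exec(str)) !== null) {
    const behind = m[1];
    const wanted = m[2];
    const start = m.index + behind.length;
    // Skip forbidden prefixes and hits overlapping an earlier replacement.
    if (behind !== forbidden && start >= cursor) {
      pieces.push(str.slice(cursor, start), replacement);
      cursor = start + wanted.length;
    }
    // The lookahead match is empty, so advance manually to avoid looping forever.
    re.lastIndex = m.index + 1;
  }
  pieces.push(str.slice(cursor));
  return pieces.join('');
}

console.log(
  replaceUnlessPreceded('Fall ball bill balll llama', /(?=(..|^.?)(ll))/g, 'ba', '[match]')
);
// Fa[match] ball bi[match] bal[match] [match]ama
```

Using exec in a loop rather than replace avoids the no-op replacements of '' with '', at the cost of managing lastIndex by hand.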

Answer 8

This effectively does it:

"jim".match(/[^a-g]m/)
> ["im"]
"jam".match(/[^a-g]m/)
> null

Search and replace example:

"jim jam".replace(/([^a-g])m/g, "$1M")
> "jiM jam"

Note that the negative lookbehind string must be exactly 1 character long for this to work.

Answer 9

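Using your case, if you want to replace m with something, e.g. convert it to an uppercase M, you can negate the set inside a capturing group.

Match ([^a-g])m, replace with $1M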

"jim jam".replace(/([^a-g])m/g, "$1M")
// jiM jam

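([^a-g]) matches any character that is not (^) in the a-g range and stores it in the first capturing group, so you can access it with $1.

So we find im in jim and replace it with iM, which results in jiM.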

Answer 10

This is how I implemented str.split(/(?<!^)@/) for Node.js 8 (which doesn't support lookbehind):

str.split('').reverse().join('').split(/@(?!$)/).map(s => s.split('').reverse().join('')).reverse()

Works? Yes (Unicode untested). Unpleasant? Yes.
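On the Unicode point, a possible variant reverses by code point instead of by UTF-16 code unit, so surrogate pairs survive the round trip. This is a sketch only; reverse and splitOnAtExceptLeading are illustrative names, and combining marks would still be reordered.

```javascript
// Array.from iterates by code point, so surrogate pairs (e.g. emoji) stay intact.
const reverse = (s) => Array.from(s).reverse().join('');

// Emulates str.split(/(?<!^)@/): split on "@" except one at the very start.
const splitOnAtExceptLeading = (str) =>
  reverse(str)
    .split(/@(?!$)/)
    .map((part) => reverse(part))
    .reverse();

console.log(splitOnAtExceptLeading('@a@b'));          // [ '@a', 'b' ]
console.log(splitOnAtExceptLeading('@\u{1F600}@b'));  // [ '@😀', 'b' ]
```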

Answer 11

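As mentioned before, JavaScript allows lookbehinds now. In older browsers you still need a workaround.

I bet there is no way to find a regex without lookbehind that delivers exactly the same result. All you can do is work with groups. Suppose you have a regex (?<!Before)Wanted, where Wanted is the regex you want to match and Before is the regex that describes what must not precede the match. The best you can do is negate the regex Before and use the regex NotBefore(Wanted). The desired result is the first group, $1.

In your case Before=[abcdefg], which is easy to negate: NotBefore=[^abcdefg]. So the regex would be [^abcdefg](m). If you need the position of Wanted, you must group NotBefore too, so that the desired result is the second group.

If the matches of the Before pattern have a fixed length n, that is, if the pattern contains no repetition tokens, you can avoid negating the Before pattern and use the regex (?!Before).{n}(Wanted), but you still have to use the first group, or use the regex (?!Before)(.{n})(Wanted) and use the second group. In this example the pattern Before does have a fixed length, namely 1, so use the regex (?![abcdefg]).(m) or (?![abcdefg])(.)(m). If you are interested in all matches, add the g flag; see my code snippet: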

function TestSORegEx() {
  var s = "Donald Trump doesn't like jam, but Homer Simpson does.";
  var reg = /(?![abcdefg])(.{1})(m)/gm;
  var out = "Matches and groups of the regex " + 
            "/(?![abcdefg])(.{1})(m)/gm in \ns = \"" + s + "\"";
  var match = reg.exec(s);
  while(match) {
    var start = match.index + match[1].length;
    out += "\nWhole match: " + match[0] + ", starts at: " + match.index
        +  ". Desired match: " + match[2] + ", starts at: " + start + ".";   
    match = reg.exec(s);
  }
  out += "\nResulting string after statement s.replace(reg, \"$1*$2*\")\n"
         + s.replace(reg, "$1*$2*");
  alert(out);
}

Answer 12

/(?![abcdefg])[^abcdefg]m/gi Yes, this is a trick.

Answer 13

This might help, depending on the context:

This matches the m in jim but not jam:

"jim jam".replace(/[a-g]m/g, "").match(/m/g)

Javascript: negative lookbehind equivalent?
