RegExp.exec() occasionally returns null

Asked: 2011-01-18 13:37:47

Tags: javascript regex

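I'm seriously going crazy over this, and I've already spent a disproportionate amount of time trying to figure out what's going on here. So please give me a hand =)

I need to do some RegExp matching on strings in JavaScript. Unfortunately, it behaves very strangely. This code: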

var rx = /(cat|dog)/gi;
var w = new Array("I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.");

for (var i in w) {
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
    }else{
        document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
    }
}

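returns "cat" and "dog" for the first two elements, as it should, but then some of the exec() calls start returning null. I don't understand why.

I posted a Fiddle here, where you can run and edit the code.

So far I've tried this in Chrome and Firefox.

Cheers!

/Christoffer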

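4 Answers:

Answer 0: (score: 63)

Oh, here it is. Because you defined your regex as global, it first matches cat and then, on the second pass of the loop, dog. So basically you just need to reset your regex (its internal pointer) on each pass as well. Compare this: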

var w = new Array("I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.", "I have a cat and a dog too.");

for (var i in w) {
    var rx = /(cat|dog)/gi;
    var m = null;
    m = rx.exec(w[i]);
    if(m){
        document.writeln("<p>" + i + "<br/>INPUT: " + w[i] + "<br/>MATCHES: " + w[i].length + "</p>");
    }else{
        document.writeln("<p><b>" + i + "<br/>'" + w[i] + "' FAILED.</b><br/>" + w[i].length + "</p>");
    }
    document.writeln(m);
}

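Answer 1: (score: 58)

A regex object has a lastIndex property, which is updated when you run exec. So when you exec the regex on, e.g., "I have a cat and a dog too.", lastIndex is set to 12. The next time you run exec on the same regex object, it starts looking from index 12. So you have to reset the lastIndex property between runs.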
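To make the lastIndex bookkeeping concrete, here is a minimal sketch that replays the first three exec() calls from the question with one shared global regex. The variable names (a, b, m1, afterFirst, and so on) are made up for illustration.

```javascript
// One shared regex with the g flag, as in the question.
var rx = /(cat|dog)/gi;
var a = "I have a cat and a dog too.";
var b = "There once was a dog and a cat.";

var m1 = rx.exec(a);            // search starts at 0, finds "cat" at index 9
var afterFirst = rx.lastIndex;  // 12, just past "cat"

var m2 = rx.exec(b);            // search starts at 12 in b, finds "dog" at index 17
var afterSecond = rx.lastIndex; // 20, just past "dog"

var m3 = rx.exec(a);            // search starts at 20 in a, past both words: null
var afterThird = rx.lastIndex;  // a failed match resets lastIndex to 0
```

The third call fails only because the search begins at index 20 of a string whose last match ends at index 22, which is why the failures in the question look sporadic.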

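Answer 2: (score: 21)

Two things:

  1. The need to reset when using the g (global) flag has already been mentioned. To solve this, I recommend simply assigning 0 to the lastIndex member of the RegExp object. This performs better than destroying and recreating the object.
  2. Be careful when using the in keyword to walk an Array object, because with some libraries it can lead to unexpected results. Sometimes you should check something like isNaN(i), or, if you know the array has no holes, use a classic for loop.

The code could be: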

    var rx = /(cat|dog)/gi;
    var w = ["I have a cat and a dog too.", "There once was a dog and a cat.", "I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat.","I have a cat and a dog too.", "There once was a dog and a cat."];
    
    for (var i in w)
     if(!isNaN(i))        // Optional, check it is an element if Array could have some odd members.
      {
       var m = null;
       m = rx.exec(w[i]); // Run
       rx.lastIndex = 0;  // Reset
       if(m)
        {
         document.writeln("<pre>" + i + "\nINPUT: " + w[i] + "\nMATCHES: " + m.slice(1) + "</pre>");
        } else {
         document.writeln("<pre>" + i + "\n'" + w[i] + "' FAILED.</pre>");
        }
      }
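As a rough sketch of both points together, the variant below uses a classic for loop and resets lastIndex before every exec() call. The helper name firstMatches and the extra note property are hypothetical, added only to show what for...in would pick up.

```javascript
// Hypothetical helper: first captured word in each string, or null.
function firstMatches(rx, strings) {
    var results = [];
    for (var i = 0; i < strings.length; i++) { // index loop: visits elements only
        rx.lastIndex = 0;                      // always search from the start
        var m = rx.exec(strings[i]);
        results.push(m ? m[1] : null);
    }
    rx.lastIndex = 0;                          // leave the regex in a clean state
    return results;
}

var w = [
    "I have a cat and a dog too.",
    "There once was a dog and a cat.",
    "I have a cat and a dog too."
];
w.note = "not an element"; // extra enumerable property, as a library might add

// for...in enumerates property names, so it also sees "note".
var keys = [];
for (var k in w) {
    keys.push(k);
}

var found = firstMatches(/(cat|dog)/gi, w);
```

Resetting before the call rather than after it means the helper also behaves if the caller passes in a regex that was already used elsewhere.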
    

Answer 3: (score: 4)

I had a similar problem when using /g, and the solutions proposed here did not work for me in Firefox 3.6.8. I got my script working by using:

var myRegex = new RegExp("my string", "g");

I'm adding this in case someone else runs into the same problem with the solutions above.
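A minimal sketch of that approach applied to the question's strings follows. The helper name firstMatchFresh is hypothetical. One possible explanation, stated here as an assumption the answer does not confirm, is that engines following ECMAScript 3 reused a single object for each regex literal, so a literal inside a loop kept its lastIndex, whereas the RegExp constructor always builds a new object.

```javascript
// Hypothetical helper: builds a new RegExp object for every string,
// so each search starts with lastIndex === 0.
function firstMatchFresh(source, flags, strings) {
    var out = [];
    for (var i = 0; i < strings.length; i++) {
        var rx = new RegExp(source, flags); // constructor call: new object each pass
        var m = rx.exec(strings[i]);
        out.push(m ? m[0] : null);
    }
    return out;
}

var fresh = firstMatchFresh("(cat|dog)", "gi", [
    "I have a cat and a dog too.",
    "There once was a dog and a cat.",
    "I have a cat and a dog too."
]);
```

This trades a little allocation per iteration for not having to think about lastIndex at all.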