Question

I'm working on a template engine and I'm trying to capture every string inside <% %>, but when I handle the <%object.property%> pattern, everything fails. My code:

var render = function(input, data){
    var re = /<%([^%>]+)?%>/g;
    var templateVarArray;
    // var step = "";
    while((templateVarArray = re.exec(input))!=null){
        var strArray = templateVarArray[1].split(".");
        // step+= templateVarArray[1]+" ";
        if(strArray.length==1)
            input = input.replace(templateVarArray[0], data[templateVarArray[1]]);
        if(strArray.length==2){
            input = input.replace(templateVarArray[0], data[strArray[0]][strArray[1]]);
        }
    }
    // return step;
    return input;
}
var input = "<%test.child%><%more%><%name%><%age%>";

document.write(render(input,{
    test: { child: "abc"},
    more: "MORE",
    name:"ivan",
    age: 22


}));

My result:

abc<%more%><%name%>22

What I want is: abc MORE ivan 22

Also, I took the RegExp /<%([^%>]+)?%>/g from an online reference. I did look up what it means, but I'm still not sure. In particular, why does it need the "+" and the "?"? Thanks a lot!

Answer 1

If you add a console.log() statement, it will show where the next search will start:

while((templateVarArray = re.exec(input))!=null){
    console.log(re.lastIndex);    // <-- insert this
    var strArray = templateVarArray[1].split(".");
    // step+= templateVarArray[1]+" ";
    if(strArray.length==1)
        input = input.replace(templateVarArray[0], data[templateVarArray[1]]);
    if(strArray.length==2){
        input = input.replace(templateVarArray[0], data[strArray[0]][strArray[1]]);
    }
}

You will see something like:

14
26

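This means that the next time re.exec(...) runs, it will start searching at index 14 and then at index 26. Because each replacement shortens the string while lastIndex keeps pointing at an offset in the old string, the search skips over some of the remaining tags.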
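The effect can be reproduced in isolation. The following is a minimal sketch rather than part of the original code: the helper name traceLastIndex and the three-tag input are made up for illustration, and the data is kept flat so that only the lastIndex behaviour is on display.

```javascript
// Hypothetical helper: runs the question's loop with the 'g' flag and
// records where the regex will resume searching after each match.
function traceLastIndex(input, data) {
  var re = /<%([^%>]+)?%>/g;
  var positions = [];
  var match;
  while ((match = re.exec(input)) !== null) {
    positions.push(re.lastIndex);
    // The replacement shortens the string, but re.lastIndex still refers to
    // an offset in the old, longer string.
    input = input.replace(match[0], data[match[1]]);
  }
  return { positions: positions, output: input };
}

var traced = traceLastIndex("<%a%><%b%><%c%>", { a: "1", b: "2", c: "3" });
console.log(traced.positions); // [ 5, 11 ]
console.log(traced.output);    // "1<%b%>3"
```

The middle tag is skipped because, after the first replacement, offset 5 already lies past the start of <%b%>.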

As @Alexander pointed out, take the 'g' off the end of the regular expression. Now you will see something like this:

0
0

This means that every search starts at the beginning of the string, and you should now get what you wanted:

abcMOREivan22

As for your question about the RegEx and what it is doing, let's break it apart:

<% - this matches the literal '<' followed immediately by '%'

([^%>]+) - the brackets (...) indicate we want to capture the portion of the string that matches the expression within the brackets
  [^...] - indicates to match anything except what follows the '^'; without the '^' would match whatever pattern is within the []
  [^%>] - matches a single character that is neither '%' nor '>'
  [^%>]+ - '+' indicates to match one or more; in other words, match a run of one or more characters, none of which is '%' or '>'

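? - because it follows the closing bracket of the group, this makes the whole group optional (zero or one occurrence); it does not request reluctant matching in this position

%> - this matches the literal '%' followed immediately by '>'

The trickiest part to understand is the '?'. Placed after the group, as it is here, it means the regex also matches an empty tag '<%%>', in which case the captured value is undefined. A '?' placed directly after a quantifier, as in '[^%>]+?', would instead request reluctant matching (as opposed to the default 'greedy' matching): stop at the shortest match that still lets the whole regex match. For the input in this question, including it or not makes no difference, although with other patterns it can matter.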
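What the '?' does depends on where it is placed, and this can be checked directly. A small sketch follows; the variable names and the sample strings are illustrative and not from the original post.

```javascript
// '?' after the group: the group is optional, so an empty tag matches and
// the capture is undefined.
var optionalGroup = /<%([^%>]+)?%>/;
var emptyTag = optionalGroup.exec("<%%>");
console.log(emptyTag[0], emptyTag[1]); // "<%%>" undefined

// Without the '?', an empty tag does not match at all.
var requiredGroup = /<%([^%>]+)%>/;
console.log(requiredGroup.exec("<%%>")); // null

// '?' directly after a quantifier: reluctant instead of greedy matching.
var greedy = /<%(.+)%>/.exec("<%a%><%b%>");
var reluctant = /<%(.+?)%>/.exec("<%a%><%b%>");
console.log(greedy[1]);    // "a%><%b"
console.log(reluctant[1]); // "a"
```

One consequence for the render loop: on an empty tag the capture is undefined, so calling .split on it would throw a TypeError.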

Suggested improvement

The current logic is limited to data nested two levels deep. To make it handle arbitrary nesting, you could do this:

First, add a small function to do the substitution:

var substitute = function (str, data) {
  return str.split('.').reduce(function (res, item) {
    return res[item];
  }, data);
};

Then, change the while loop to:

  while ((templateVarArray = re.exec(input)) != null) {
    input = input.replace(templateVarArray[0], substitute(templateVarArray[1], data));
  }

Not only does it handle any number of levels, you may also find other uses for the 'substitute()' function.
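As a usage sketch, here are the two pieces put together with the non-global regex. The wrapper name renderNested and the three-level data object are made up for illustration.

```javascript
var substitute = function (str, data) {
  // Walk the dotted path one key at a time, starting from the data object.
  return str.split('.').reduce(function (res, item) {
    return res[item];
  }, data);
};

// Hypothetical wrapper around the loop shown above.
var renderNested = function (input, data) {
  var re = /<%([^%>]+)?%>/; // no 'g' flag, so every search starts at index 0
  var match;
  while ((match = re.exec(input)) != null) {
    input = input.replace(match[0], substitute(match[1], data));
  }
  return input;
};

console.log(renderNested("<%user.address.city%>, <%user.name%>", {
  user: { name: "ivan", address: { city: "Sofia" } }
}));
// prints "Sofia, ivan"
```

Note that this sketch does no error handling: a missing intermediate key makes res[item] throw a TypeError, and a missing final key inserts the text "undefined".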

Answer 2

The RegExp.prototype.exec() documentation says:

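If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test() will also advance the lastIndex property).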
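The documented behaviour is easiest to see on a string that is left unmodified between calls. A minimal sketch, with a made-up helper name (collectMatches) and a made-up input:

```javascript
// Hypothetical helper: collects each capture together with the lastIndex
// value the regex holds right after that match.
function collectMatches(str) {
  var re = /<%([^%>]+)%>/g;
  var found = [];
  var m;
  while ((m = re.exec(str)) !== null) {
    found.push(m[1] + "@" + re.lastIndex);
  }
  return found;
}

console.log(collectMatches("<%a%><%b%>")); // [ 'a@5', 'b@10' ]
```

Each call resumes exactly where the previous match ended, which is only correct as long as the string itself does not change.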

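But you are replacing each match in the original string, so the next re.exec call, whose lastIndex is no longer zero, will not search from the beginning and will skip some tags.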

So, if you want to search the original string and replace the matches in it as you go, just omit the 'g' (global) flag:

var render = function(input, data) {
  var re = /<%([^%>]+)?%>/;
  var templateVarArray;
  // var step = "";
  while (!!(templateVarArray = re.exec(input))) {
    var strArray = templateVarArray[1].split(".");
    if (strArray.length == 1)
      input = input.replace(templateVarArray[0], data[templateVarArray[1]]);
    if (strArray.length == 2) {
      input = input.replace(templateVarArray[0], data[strArray[0]][strArray[1]]);
    }
  }
  // return step;
  return input;
}
var input = "<%test.child%><%more%><%name%><%age%>";

document.write(render(input, {
  test: {
    child: "abc"
  },
  more: "MORE",
  name: "ivan",
  age: 22
}));
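An alternative that neither answer mentions, offered only as a sketch: String.prototype.replace accepts a function as its second argument, and with the 'g' flag it visits every match in a single pass, so lastIndex never has to be managed by hand. The name renderOnce is made up, and leaving unresolved tags untouched is a design choice of this sketch, not something the original code does.

```javascript
var renderOnce = function (input, data) {
  return input.replace(/<%([^%>]+)%>/g, function (whole, path) {
    // Resolve "a.b.c" against data one key at a time.
    var value = path.split(".").reduce(function (res, key) {
      return res == null ? undefined : res[key];
    }, data);
    // Leave the tag untouched if the path does not resolve.
    return value === undefined ? whole : String(value);
  });
};

console.log(renderOnce("<%test.child%><%more%><%name%><%age%>", {
  test: { child: "abc" }, more: "MORE", name: "ivan", age: 22
}));
// prints "abcMOREivan22"
```

Because the replacement text is never scanned again, a value that itself contains <%...%> cannot cause an endless loop.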

RegExp not working properly

2 answers: