For regex gurus: split on -> while avoiding bracket/quote content

Time: 2013-02-27 05:26:09

Tags: javascript regex

Given the string:

  

funny -> A_gre$"["at -> looks -> great/*54[[funny] -> [" -> [great -> yolo] -> looks]][great] -> a2afg34423*/- -> yolo" -> ["

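Split it into the array:

  1. funny
  2. A_gre$'['at
  3. looks
  4. great/*54[[funny] -> [' -> [great -> yolo] -> looks]][great]
  5. a2afg34423*/-
  6. yolo' -> ['

Regex solution??!

Basically, if a bracket is surrounded by quotes, do not treat it as an opening/closing delimiter; otherwise, make sure the text between the opening bracket and its closing bracket is ignored when looking for ->. How can I achieve this with a regex?

My parser solution (Test):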

    var s = "funny -> A_gre$' [ 'at -> looks -> great/*54[ [funny ' -> [ ' -> "
            + "[great -> yolo] -> looks]][great] -> a2afg34423*/- -> yolo' -> [ '",
        p = 0,    // bracket nesting depth
        z = [0],  // start index of each piece
        q = 0,    // number of quote characters seen so far
        x = s.split('');
    
    //Looking for \" not \'
    for (var i = 0; i < x.length; i++) {
        var b = x[i],
            c = x[i + 1];
        // An odd quote count means we are inside a quoted run
        if (b == "'") q++;
        // Track bracket depth only outside quotes
        if (!(q % 2)) {
            if (b == '[') p++;
            else if (b == ']') p--;
        }
        // A "->" outside quotes and brackets is a split point
        if (b == '-' && c == '>' && !p && !(q % 2))
            z.push(i + 2);
    }
    z.push(x.length);
    
    // Cut the pieces; every piece but the last ends 2 characters early to drop its "->"
    x = [];
    for (var u = 0; u + 1 < z.length; u++)
        x.push(s.substring(z[u], u + 2 < z.length ? z[u + 1] - 2 : z[u + 1]).trim());
    
    console.log(x)
    

Output:

    ->>> [
          "funny", 
          "A_gre$' [ 'at", 
          "looks", 
          "great/*54[ [funny ' -> [ ' -> [great -> yolo] -> looks]][great]", 
          "a2afg34423*/-", 
          "yolo' -> [ '"
         ]
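
For readability, the same scan can be restated as a standalone function. This is only a sketch: `splitTopLevel` and its `quote` parameter are hypothetical names that do not appear in the original post, and it assumes quotes and brackets in the input are balanced.

```javascript
// Hypothetical helper (not from the original post): split `text` on "->"
// only where the scanner is outside quotes and outside brackets.
function splitTopLevel(text, quote = "'") {
  const parts = [];
  let depth = 0;        // current [ ] nesting depth
  let inQuote = false;  // true while between a pair of quote characters
  let start = 0;        // index where the current piece begins

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === quote) {
      inQuote = !inQuote;
    } else if (!inQuote && ch === "[") {
      depth++;
    } else if (!inQuote && ch === "]") {
      depth--;
    } else if (!inQuote && depth === 0 && ch === "-" && text[i + 1] === ">") {
      parts.push(text.slice(start, i).trim());
      start = i + 2;  // the next piece begins after "->"
      i++;            // do not re-scan the ">"
    }
  }
  parts.push(text.slice(start).trim());
  return parts;
}

// Same test string as the parser above (single quotes as the quote character).
const sample = "funny -> A_gre$' [ 'at -> looks -> great/*54[ [funny ' -> [ ' -> " +
  "[great -> yolo] -> looks]][great] -> a2afg34423*/- -> yolo' -> [ '";
console.log(splitTopLevel(sample));
```

On the test string it should yield the same six pieces as the output listed above. Passing `'"'` as the second argument switches the quote character to a double quote.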
    

1 answer:

Answer 0 (score: 2)

Try this pattern:

([^\s\[\"]*\[[^\]]+\])\S*|([^\s\[\"]*\"[^\"]+\")\S*|(\w\S*)

Use regexpal to see what it matches. The pattern consists of three parts; one of them is described below:

([^\s\[\"]*\[[^\]]+\])\S*

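It reads a run of characters that are not whitespace, quotes, or brackets until it reaches an opening bracket, then reads the bracket's content up to the closing bracket, and then consumes any non-whitespace characters that follow. Here is a more detailed description of the bracket-matching part: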

\[         : opening bracket character
[          : regex syntax for starting a set definition
   ^       : It's a negative set, i.e., set of characters which are NOT:
   \]      : closing bracket character
]+         : regex syntax for ending a set definition and the + operator for matching 1 or more occurrences
\]         : closing bracket character
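
To see what that fragment does in isolation, here is a small sketch that runs it against a few made-up strings (the variable names and sample strings are illustrative only). It suggests that the fragment matches up to the first closing bracket, so it does not balance nested brackets.

```javascript
// The bracket-matching fragment from the breakdown above, on its own.
const bracketPart = /\[[^\]]+\]/;

// A flat bracket pair is matched whole.
const flat = "x[great]y".match(bracketPart)[0];

// With nested brackets, the match ends at the first "]".
const nested = "[[funny] -> looks]".match(bracketPart)[0];

// "[]" has nothing between the brackets, so "+" (one or more) cannot match.
const empty = "[]".match(bracketPart);

console.log(flat);    // [great]
console.log(nested);  // [[funny]
console.log(empty);   // null
```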

The second part handles quotes in the same way, and the third matches words that contain neither brackets nor quotes.
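
As a rough illustration of how the three parts divide the work, the sketch below checks which capture group took part in each match. The input string and the `kind` labels are made up for this example.

```javascript
// The full pattern from above; the g flag lets exec() step through matches.
const pattern = /([^\s\[\"]*\[[^\]]+\])\S*|([^\s\[\"]*\"[^\"]+\")\S*|(\w\S*)/g;
const text = 'ab[c d]e x"y z"w plain ->';  // illustrative input

const found = [];
let m;
while ((m = pattern.exec(text)) !== null) {
  // Groups that did not take part in the match are undefined.
  const kind = m[1] !== undefined ? "bracket"
             : m[2] !== undefined ? "quote"
             : "word";
  found.push({ kind: kind, text: m[0] });
}

console.log(found);
// bracket: ab[c d]e
// quote:   x"y z"w
// word:    plain
```

The trailing `->` is not matched at all here, because none of the three parts can start on `-` or `>` unless a bracket or quote follows in the same token.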

The following code shows how to list the matches and how to extract them one at a time:

var input = 'funny -> A_gre$"["at -> looks -> great/54[[funny] -> [" -> [great -> yolo] -> looks]][great] -> a2afg34423/- -> yolo" -> ["';

var regexp = /([^\s\[\"]*\[[^\]]+\])\S*|([^\s\[\"]*\"[^\"]+\")\S*|(\w\S*)/g;

// With a global regex, match() returns every full match as an array.
var result = input.match(regexp);
console.log("Array of matches:");
console.log(result);

// exec() steps through the matches one by one and also reports each index.
var results = regexp.exec(input);
while (results != null) {
    console.log("index: " + results.index + " found: " + results[0]);
    results = regexp.exec(input);
}

This can be seen here: http://jsfiddle.net/LXqch/1/