Regex to extract words inside nested parentheses

Time: 2018-03-23 09:45:21

Tags: regex google-apps-script

I am looking for a regular expression that can accomplish this task.

Message body input: Test1 (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)

The result I want: (text(text here(possible text)text(possible text(more text))))

I want to collect everything inside ti,ab(................)

var messageBody = message.getPlainBody();
var ssFile = DriveApp.getFileById(id);
DriveApp.getFolderById(folder.getId()).addFile(ssFile);
var ss = SpreadsheetApp.open(ssFile);
var sheet = ss.getSheets()[0];
sheet.insertColumnAfter(sheet.getLastColumn());
SpreadsheetApp.flush();
sheet = ss.getSheets()[0];
var range = sheet.getRange(1, 1, sheet.getLastRow(), sheet.getLastColumn() + 1);
var values = range.getValues();

values[0][sheet.getLastColumn()] = "Search Strategy";

for (var i = 1; i < values.length; i++) {
  // here is my regexp
  var y = messageBody.match(/\((ti,ab.*)\)/ig);
  if (y) {
    values[i][values[i].length - 1] = y.toString();
  }
}
range.setValues(values);

1 Answer:

Answer 0: (score: 2)

The only solution you can use here is to extract all substrings inside parentheses and then filter them to get the ones that start with ti,ab:

var a = [], r = [], result; // a: stack of '(' indexes, r: contents of every closed group
var txt = "Test1  (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)";
for(var i=0; i < txt.length; i++){
    if(txt.charAt(i) == '(') {
        a.push(i);
    }
    if(txt.charAt(i) == ')') {
        r.push(txt.substring(a.pop()+1,i));
    }
}
// Keep the groups that start with "ti,ab(", then drop that 6-character prefix and the final ")"
result = r.filter(function(x) { return /^ti,ab\(/.test(x); })
          .map(function(y) {return y.substring(6,y.length-1);});
console.log(result);

The nested-parentheses function is taken from "Nested parentheses get string one by one". The /^ti,ab\(/ regex matches ti,ab( at the start of a string.
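
To call this from a loop such as the one in the question, the same stack-based idea can be wrapped in a function. What follows is only a minimal sketch: the name extractByPrefix and its prefix parameter are hypothetical (they do not appear in the original answer), and it assumes that a closing parenthesis with no matching opening one should simply be skipped.

```javascript
// Hypothetical helper (not from the original answer): collects the contents of
// every parenthesized group, then keeps the groups of the form `prefix(...)`.
function extractByPrefix(text, prefix) {
  var opens = [];   // stack of indexes of '(' that are still unmatched
  var groups = [];  // contents of each closed group, innermost first
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (ch === '(') {
      opens.push(i);
    } else if (ch === ')' && opens.length > 0) {
      // Assumption: a ')' without a matching '(' is ignored.
      groups.push(text.substring(opens.pop() + 1, i));
    }
  }
  var head = prefix + '(';
  return groups
    // Keep groups that start with `prefix(` and end with ')'.
    .filter(function (g) {
      return g.indexOf(head) === 0 && g.charAt(g.length - 1) === ')';
    })
    // Strip the leading `prefix(` and the final ')'.
    .map(function (g) {
      return g.substring(head.length, g.length - 1);
    });
}

var sample = "Test1 (Test2) (test3) (ti,ab(text(text here(possible text)text(possible text(more text))))) end (text)";
console.log(extractByPrefix(sample, "ti,ab"));
```

In the question's loop, messageBody would take the place of sample. Passing the prefix as a plain string rather than a regex keeps the prefix length available for the final substring call.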

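The solution above also extracts parenthesized groups that are nested inside other parenthesized groups. If you do not need that, use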

var txt = "Test1 (Test2) ((ti,ab(text(text here))) AND ab(test3) Near Ti(test4) NOT ti,ab,su(test5) NOT su(Test6))";
var start = 0, r = [], level = 0;
for (var j = 0; j < txt.length; j++) {
  if (txt.charAt(j) == '(') {
    if (level === 0) start = j; // remember where a top-level group opens
    ++level;
  }
  if (txt.charAt(j) == ')' && level > 0) { // skip a ')' that has no matching '('
    --level;
    if (level === 0) {
      r.push(txt.substring(start, j + 1)); // a complete top-level group
    }
  }
}
console.log("r: ", r);
var rx = "\\b(?:ti|ab|su)(?:,(ti|ab|su))*\\(";
var result = r.filter(function(y) { return new RegExp(rx, "i").test(y); })
  .map(function(x) {
    return x.replace(new RegExp(rx, "ig"), '(');
  });
console.log("Result:", result);

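The pattern used to filter the groups and remove the unnecessary words is

\b(?:ti|ab|su)(?:,(ti|ab|su))*\(

Details

  • \b - a word boundary
  • (?:ti|ab|su) - one of the three alternatives
  • (?:,(ti|ab|su))* - zero or more repetitions of a comma followed by one of the three alternatives
  • \( - a literal (

Each match is replaced with ( so that the opening parenthesis consumed by the match is restored in the result.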
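
The effect of the pattern is easiest to see on a few short strings. The snippet below is a small sketch: the helper name stripFieldTags and the sample strings are made up for illustration, and only the pattern itself comes from the answer.

```javascript
// Same pattern as in the answer, written as a regex literal with the i and g flags.
var fieldTagRx = /\b(?:ti|ab|su)(?:,(ti|ab|su))*\(/ig;

// Hypothetical helper: replaces every "tag(" or "tag,tag(" prefix with a bare "(".
function stripFieldTags(s) {
  return s.replace(fieldTagRx, '(');
}

console.log(stripFieldTags("ti,ab(heart) AND su(lung)"));     // (heart) AND (lung)
console.log(stripFieldTags("Ti(test4) NOT ti,ab,su(test5)")); // (test4) NOT (test5)
console.log(stripFieldTags("(tab(x))"));                      // (tab(x)), "ab" is not at a word boundary
```

The last call shows why the leading \b matters: without it, the ab inside tab( would also be treated as a field tag.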