Matching a pattern multiple times in a text file?

Date: 2013-04-09 17:10:41

Tags: java regex string matcher

Question:

In Java, I am parsing a script and want to extract from the file any text that starts with

    ${GLOBAL_ or ${AUTO_

I want to pick up everything up to and including the closing brace. For example, given the following string:

    "this is a String ${AUTO_TEST_f} body ${GLOBAL_SYNC} ${AUTO_2} ${OTHER_VAR}"

The result should be:

${AUTO_TEST_f} 

${GLOBAL_SYNC}

${AUTO_2} 

What I have tried:

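I created regex patterns (which I believe work) and used them to create matchers. I then tried to print all matches to the console using the Matcher, but I ran into some problems. For some reason, it skips ${GLOBAL_VARIABLE_1}. Also, how can I get it to give me all the matches? A while loop [while match.group(0) != null]?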

Here is my code:

String re1="(\\$)";    // Any Single Character 1
String re2="(\\{)"; // Any Single Character 2
String re3="(G)";   // Any Single Character 3
String re4="(L)";   // Any Single Character 4
String re5="(O)";   // Any Single Character 5
String re6="(B)";   // Any Single Character 6
String re7="(A)";   // Any Single Character 7
String re8="(L)";   // Any Single Character 8
String re9="(_)";   // Any Single Character 9
String re10="(.*?)";    // Any Single Character 10
String re11="(\\})";    // Any Single Character 11


String r1="(\\$)";  // Any Single Character 1
String r2="(\\{)";  // Any Single Character 2
String r3="(A)";    // Any Single Character 3
String r4="(U)";    // Any Single Character 4
String r5="(T)";    // Any Single Character 5
String r6="(O)";    // Any Single Character 6
String r7="(_)";    // Any Single Character 7
String r8="(.*?)";  // Any Single Character 8
String r9="(\\})";  // Any Single Character 9

Pattern p = Pattern.compile((re1+re2+re3+re4+re5+re6+re7+re8+re9+re10+re11));
Pattern p2 = Pattern.compile(r1+r2+r3+r4+r5+r6+r7+r8+r9);

Matcher m = p.matcher(txt);
Matcher m1 = p2.matcher(txt);
m1.find();
System.out.println(m1.group(0));
m.find();
System.out.println(m.group(0));

Here is the console output:

 Actual Results:
 ${AUTO_1}
 ${GLOBAL_VARIABLE_2}

Here are my expected results:

 Expected Results:
 ${GLOBAL_VARIABLE_1}
 ${AUTO_1}
 ${GLOBAL_VARIABLE_2}
 ${GLOBAL_VARIABLE_3}

Thanks!

3 Answers:

Answer 0 (score: 4)

Don't overcomplicate it:

String txt = "this is a String ${AUTO_TEST_f} body ${GLOBAL_SYNC} ${AUTO_2}";
String regex = "\\$\\{(AUTO|GLOBAL)_(.*?)\\}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(txt);
while (matcher.find()) {
    System.out.println(matcher.group() + "\t->\t" + matcher.group(2) + "\t(" + matcher.group(1) + ")" );
}

Output:

${AUTO_TEST_f}  ->  TEST_f  (AUTO)
${GLOBAL_SYNC}  ->  SYNC    (GLOBAL)
${AUTO_2}       ->  2       (AUTO)
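
On the loop question: find() returns false once there are no more matches, so it is the loop condition. Calling group() before a successful find() throws an IllegalStateException rather than returning null, so a check like `group(0) != null` would not work as a stopping condition. Below is a minimal sketch contrasting a single find() call with a find() loop. The class and method names (FindAllMatches, firstMatch, allMatches) are made up for illustration, and the input is the sample string from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class FindAllMatches {
    // Same pattern as in the answer above: ${AUTO_...} or ${GLOBAL_...}
    private static final Pattern VARIABLE =
            Pattern.compile("\\$\\{(AUTO|GLOBAL)_(.*?)\\}");

    // A single find() call yields at most one match (null if there is none).
    static String firstMatch(String text) {
        Matcher m = VARIABLE.matcher(text);
        return m.find() ? m.group() : null;
    }

    // Calling find() repeatedly until it returns false yields every match.
    static List<String> allMatches(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = VARIABLE.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String txt = "this is a String ${AUTO_TEST_f} body ${GLOBAL_SYNC} ${AUTO_2} ${OTHER_VAR}";
        System.out.println(firstMatch(txt));
        System.out.println(allMatches(txt));
    }
}
```

Because one pattern covers both prefixes, the matches come back in the order they appear in the text, which two separate matchers would not give you directly.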

Answer 1 (score: 1)

Try this:

String data = "this is a String ${AUTO_TEST_f} body ${GLOBAL_SYNC} ${AUTO_2}";
Pattern pattern = Pattern.compile("\\$\\{.+?\\}");

Matcher matcher = pattern.matcher(data);

while (matcher.find()) {
    // A match was found; do further processing here
    System.out.println(matcher.group());
}

The output is:

${AUTO_TEST_f}
${GLOBAL_SYNC}
${AUTO_2}
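
One caveat worth checking, offered as an observation rather than as part of this answer: the pattern `\$\{.+?\}` accepts any `${...}` placeholder, so on the question's full sample string (which also contains `${OTHER_VAR}`) it would return that one too. The sketch below compares it with a prefix-restricted pattern. The class name BroadVersusPrefixed and the helper matches are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison helper, not part of the original answer.
class BroadVersusPrefixed {
    // Collects every match of the given regex in the given text.
    static List<String> matches(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample string from the question, including ${OTHER_VAR}
        String txt = "this is a String ${AUTO_TEST_f} body ${GLOBAL_SYNC} ${AUTO_2} ${OTHER_VAR}";

        // Broad pattern: any ${...} placeholder
        System.out.println(matches("\\$\\{.+?\\}", txt));

        // Restricted pattern: only ${AUTO_...} and ${GLOBAL_...}
        System.out.println(matches("\\$\\{(AUTO|GLOBAL)_.*?\\}", txt));
    }
}
```

If the input only ever contains AUTO_ and GLOBAL_ variables, the two patterns behave the same; the difference only shows up when other placeholders are present.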

Answer 2 (score: 1)

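Whatever you are doing there, it is not the right way to code it. Composing a regex from smaller components is fine, but it makes no sense when you break the components down into single characters.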

If you want to get things that start with GLOBAL or AUTO, then simply:

\$\{(GLOBAL|AUTO)_.*?\}

Putting the regex into a string literal:

"\\$\\{(GLOBAL|AUTO)_.*?\\}"
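
To illustrate what composing from meaningful components might look like, here is a minimal sketch that assembles the same regex from four named pieces and collects every match. The names (ComposedPattern, OPEN, PREFIX, BODY, CLOSE, extract) are invented for this example. The input string is also an assumption, built from the variable names in the question's expected results, since the question does not show the actual contents of txt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; all names here are illustrative.
class ComposedPattern {
    static final String OPEN = "\\$\\{";            // literal "${"
    static final String PREFIX = "(GLOBAL|AUTO)_";  // allowed prefix plus underscore
    static final String BODY = ".*?";               // rest of the name, as short as possible
    static final String CLOSE = "\\}";              // literal "}"

    // Concatenates to the same literal as above: "\\$\\{(GLOBAL|AUTO)_.*?\\}"
    static final String REGEX = OPEN + PREFIX + BODY + CLOSE;
    static final Pattern VARIABLE = Pattern.compile(REGEX);

    // Returns every ${GLOBAL_...} or ${AUTO_...} variable, in order of appearance.
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = VARIABLE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed input; the question does not show the real txt value.
        String txt = "a ${GLOBAL_VARIABLE_1} b ${AUTO_1} c ${GLOBAL_VARIABLE_2} d ${GLOBAL_VARIABLE_3}";
        for (String variable : extract(txt)) {
            System.out.println(variable);
        }
    }
}
```

Each component here corresponds to one part of the syntax (opening delimiter, prefix, body, closing delimiter), which keeps the pieces readable without splitting the word GLOBAL into individual letters.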