Java Regex tuning

Time: 2016-02-04 07:12:40

Tags: java regex

Can anyone help me figure out what I am doing wrong?

My sample text:

{[|Name:A|Class:1|Sex:Male|][|Name:B|Class:2|Sex:Female|][|Name:C|Class:3|Sex:Male|]}

Expected output:

|Name:A|Class:1|Sex:Male|
Name:A
Class:1
Sex:Male
|Name:B|Class:2|Sex:Female|
Name:B
Class:2
Sex:Female
|Name:C|Class:3|Sex:Male|
Name:C
Class:3
Sex:Male

Current output:

|Name:A|Class:1|Sex:Male|
Name:A
Sex:Male
|Name:B|Class:2|Sex:Female|
Name:B
Sex:Female
|Name:C|Class:3|Sex:Male|
Name:C
Sex:Male

My program:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex {

    public static void main(String[] args) {

        String example = "{[|Name:A|Class:1|Sex:Male|][|Name:B|Class:2|Sex:Female|][|Name:C|Class:3|Sex:Male|]}";

        Pattern curlyBraces = Pattern.compile("\\[(.*?)\\]");

        Matcher m = curlyBraces.matcher(example);
        while (m.find()) {
            System.out.println(m.group(1));
            String element = m.group(1);
            Pattern pipe = Pattern.compile("\\|(.*?)\\|");
            Matcher mPipe = pipe.matcher(element);
            while (mPipe.find()) {
                System.out.println(mPipe.group(1));
            }
        }
    }
}

2 answers:

Answer 0 (score: 1)

Your problem is that "\\|(.*?)\\|" matches only |Name:A| and |Sex:Male| in the line

|Name:A|Class:1|Sex:Male|

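because a regex consumes the characters it matches, so the | between Name:A and Class:1 can only be matched once. It is used up as the closing delimiter of the first match and is therefore no longer available as the opening delimiter of Class:1.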
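To make the consumption visible, here is a minimal sketch that records each match of the original pattern together with its start and end offsets. The class name ConsumedPipeDemo and the helper describeMatches are made up for this illustration; only Pattern, Matcher, start() and end() come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConsumedPipeDemo {

    // Hypothetical helper: lists group(1) of every match of \|(.*?)\|
    // followed by the half-open offset range [start,end) of the whole match.
    static List<String> describeMatches(String element) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile("\\|(.*?)\\|").matcher(element);
        while (m.find()) {
            // end() is the offset just past the closing '|';
            // the next find() resumes searching from there.
            result.add(m.group(1) + " [" + m.start() + "," + m.end() + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : describeMatches("|Name:A|Class:1|Sex:Male|")) {
            System.out.println(line);
        }
        // prints:
        // Name:A [0,8)
        // Sex:Male [15,25)
    }
}
```

The first match ends at offset 8, just past the pipe that precedes Class:1, so the next opening | the matcher can find is the one at offset 15, in front of Sex:Male.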

Use lookaround assertions to solve this; they do not consume the text they match:

        Pattern pipe = Pattern.compile("(?<=\\|).*?(?=\\|)");
        Matcher mPipe = pipe.matcher(element);
        while (mPipe.find()) {
            System.out.println(mPipe.group(0));
        }

If you don't expect empty values, another possibility is to match all "non-pipe" characters:

        Pattern pipe = Pattern.compile("[^|]+");
        Matcher mPipe = pipe.matcher(element);
        while (mPipe.find()) {
            System.out.println(mPipe.group(0));
        }
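The two patterns differ only when a field is empty. The following sketch compares them on a made-up element with an empty middle field; the input string, the class name PipeFieldComparison and the helper collect are assumptions for illustration and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PipeFieldComparison {

    // Hypothetical helper: collects group(0) of every match of the given regex.
    static List<String> collect(String regex, String element) {
        List<String> fields = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(element);
        while (m.find()) {
            fields.add(m.group(0));
        }
        return fields;
    }

    public static void main(String[] args) {
        String element = "|Name:A||Sex:Male|"; // made-up input with an empty field

        // Lookarounds accept a zero-length match between two adjacent pipes.
        System.out.println(collect("(?<=\\|).*?(?=\\|)", element));
        // prints: [Name:A, , Sex:Male]

        // [^|]+ needs at least one character, so the empty field is skipped.
        System.out.println(collect("[^|]+", element));
        // prints: [Name:A, Sex:Male]
    }
}
```

If the position of a field matters, for example when the second field is always the class, the lookaround version is probably the safer choice because it keeps the empty slot.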

Answer 1 (score: 0)

As Tim Pietzcker already described, the | is consumed by the regex, which then cannot find Class:1.

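But you don't need a second Pattern/Matcher loop. You can use a plain element.split("\\|") instead. The argument of split is still interpreted as a regex, so the | has to be escaped. This should work for your case and is probably faster: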

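String element = m.group(1);
// With element = "Name:A|Class:1|Sex:Male" this yields ["Name:A", "Class:1", "Sex:Male"].
// If element still starts with '|', the array gets an empty first entry.
String[] splitString = element.split("\\|");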

To avoid empty strings in splitString, change the first regex pattern to "\\[\\|(.*?)\\|\\]" so that the outer pipes are no longer part of group 1.
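Put together, the split approach might look like the sketch below. It assumes the input format from the question; the class name SplitRecords and the helper parse are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitRecords {

    // Hypothetical helper: returns one list of fields per [|...|] record.
    static List<List<String>> parse(String input) {
        List<List<String>> records = new ArrayList<>();
        // The outer pipes are matched outside group 1, so split sees no leading '|'.
        Matcher m = Pattern.compile("\\[\\|(.*?)\\|\\]").matcher(input);
        while (m.find()) {
            String element = m.group(1); // e.g. "Name:A|Class:1|Sex:Male"
            records.add(Arrays.asList(element.split("\\|")));
        }
        return records;
    }

    public static void main(String[] args) {
        String example = "{[|Name:A|Class:1|Sex:Male|][|Name:B|Class:2|Sex:Female|][|Name:C|Class:3|Sex:Male|]}";
        for (List<String> record : parse(example)) {
            for (String field : record) {
                System.out.println(field);
            }
        }
    }
}
```

Unlike the program in the question, this sketch prints only the individual fields, not the whole record line before them. With the original first pattern, split would return an empty first element, because String.split keeps a leading empty string and drops only trailing ones.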