Extracting key/value pairs where a value can span multiple lines

Time: 2017-05-12 19:53:45

Tags: java

Input file:

key1=1
key2=start(a
b
c=
d)end
key3=d=e=f
somekey=start(123)end
morekey=start(1
2)end
key=jj

Output

key1    -> 1
key2    -> a
           b
           c=
           d
key3    -> d=e=f
somekey -> 123
morekey -> 1
           2
key     -> jj

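Requirements: I am trying to do this in Java. java.util.Properties cannot be used. A regular expression is acceptable but not preferred; I would rather use StringUtils.substringBetween, though a regex will do. How can I iterate over multiple lines and preserve the line breaks? The following obviously does not work for multi-line values. I plan to try a regex, but only if there is no more elegant way.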

    String[] str = line.split("=", 2);
    StringUtils.substringBetween(line, startString, endString);
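
One possible regex-free approach, sketched below, is to stop working line by line and scan the whole file content as a single string, so that the newlines inside a `start(` ... `)end` value are simply part of the substring. The class name `WholeTextParser` and its `parse` method are hypothetical. `String.indexOf` stands in for `StringUtils.substringBetween` so the sketch needs no Apache Commons dependency. It assumes `\n` line endings, that every line outside a delimited value contains `=`, and that `)end` never occurs inside a value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WholeTextParser {
    static final String START = "start(";
    static final String END = ")end";

    /** Parses key=value pairs; values wrapped in start( ... )end may contain newlines. */
    static Map<String, String> parse(String text) {
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps input order
        int pos = 0;
        while (pos < text.length()) {
            int eq = text.indexOf('=', pos);
            if (eq < 0) {
                break; // no further key=value pair
            }
            String key = text.substring(pos, eq);
            int valueStart = eq + 1;
            int valueEnd;
            if (text.startsWith(START, valueStart)) {
                // Delimited value: everything up to the next END, newlines included.
                valueStart += START.length();
                valueEnd = text.indexOf(END, valueStart);
                if (valueEnd < 0) {
                    throw new IllegalArgumentException("Unterminated value for key " + key);
                }
                pos = valueEnd + END.length();
            } else {
                // Plain value: everything up to the end of the current line.
                valueEnd = text.indexOf('\n', valueStart);
                if (valueEnd < 0) {
                    valueEnd = text.length();
                }
                pos = valueEnd;
            }
            pairs.put(key, text.substring(valueStart, valueEnd));
            if (pos < text.length() && text.charAt(pos) == '\n') {
                pos++; // step over the line break that ends this pair
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String text = "key1=1\nkey2=start(a\nb\nc=\nd)end\nkey3=d=e=f\nkey=jj";
        parse(text).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

For a real file, the content could first be read into one string (for example with `java.nio.file.Files.readAllBytes`) and then passed to `parse`.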

3 Answers:

Answer 0 (score: 1)

Do you mean something like this?

String str = "key1=1\n"
        + "key2=start(a\n"
        + "b\n"
        + "c=\n"
        + "d)end\n"
        + "key3=d=e=f\n"
        + "somekey=start(123)end\n"
        + "morekey=start(1\n"
        + "2)end\n"
        + "key=jj";
System.out.println(str.replaceAll("start\\(|\\)end", "")
        .replaceAll("(\\w{2})=", "$1\t-> ")
        .replaceAll("(\n\\w)", "\t$1"));

Answer 1 (score: 0)

One way to solve this is to write your own parser. For example:

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

public static final String START = "start(";
public static final String END = ")end";

// ...

Scanner scanner = new Scanner(
        "key1=1\n" +
        "key2=start(a\n" +
        "b\n" +
        "c=\n" +
        "d)end\n" +
        "key3=d=e=f\n" +
        "somekey=start(123)end\n" +
        "morekey=start(1\n" +
        "2)end\n" +
        "key=jj");

// LinkedHashMap keeps the pairs in input order; HashMap gives no ordering guarantee.
Map<String, String> map = new LinkedHashMap<>();
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    int eq = line.indexOf('=');
    if (eq < 0) {
        continue; // not a key=value line (e.g. blank), skip it
    }
    String key = line.substring(0, eq);
    String value = line.substring(eq + 1);
    if (value.startsWith(START)) {
        // Delimited value: keep reading lines until one ends with END,
        // re-inserting the '\n' that nextLine() strips.
        StringBuilder sb = new StringBuilder(value.substring(START.length()));
        while (!value.endsWith(END)) {
            value = scanner.nextLine();
            sb.append('\n').append(value);
        }
        value = sb.substring(0, sb.length() - END.length());
    }
    map.put(key, value);
}

for (Map.Entry<String, String> entry : map.entrySet()) {
    System.out.printf("%s -> %s%n", entry.getKey(), entry.getValue());
}
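
The print loop above writes continuation lines flush left, whereas the output requested in the question aligns them under the first line of the value. A minimal sketch of one way to get that layout follows; `PairFormatter` and `format` are hypothetical names, and the sketch assumes values use `\n` as their internal line break.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PairFormatter {
    /** Renders each pair as "key -> value", padding keys and indenting continuation lines. */
    static String format(Map<String, String> pairs) {
        int width = 1; // at least 1 so the format string below stays valid
        for (String key : pairs.keySet()) {
            width = Math.max(width, key.length());
        }
        // Continuation lines start under the value: key width plus the 4 chars of " -> ".
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < width + 4; i++) {
            indent.append(' ');
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : pairs.entrySet()) {
            String value = e.getValue().replace("\n", "\n" + indent);
            out.append(String.format("%-" + width + "s -> ", e.getKey()))
               .append(value)
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("key1", "1");
        pairs.put("key2", "a\nb\nc=\nd");
        pairs.put("somekey", "123");
        System.out.print(format(pairs));
    }
}
```

The padding width is derived from the longest key rather than hard-coded, so the column stays aligned if the keys change.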

Answer 2 (score: 0)

The following regular expression finds all key/value pairs:

(?ms)^(\w+)=(?:start\((.*?)\)end|(.*?))$

The key will be in capture group 1, and the value will be in capture group 2 or 3.

Test
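
To make the group numbering easier to follow, here is a sketch of the same pattern with the inline `(?ms)` flags written as `Pattern.MULTILINE | Pattern.DOTALL` and with named groups in place of numbers. The class name `NamedGroupDemo` and the group names `key`, `block`, and `plain` are hypothetical labels, not part of the answer. `MULTILINE` lets `^` and `$` match at line boundaries, and `DOTALL` lets `.` match line breaks, which is what allows a delimited value to span lines.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedGroupDemo {
    // Same structure as the pattern above; only the flags and group names are spelled out.
    static final Pattern PAIR = Pattern.compile(
            "^(?<key>\\w+)=(?:start\\((?<block>.*?)\\)end|(?<plain>.*?))$",
            Pattern.MULTILINE | Pattern.DOTALL);

    static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps input order
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            // group("block") is null when the plain (undelimited) alternative matched.
            String block = m.group("block");
            pairs.put(m.group("key"), block != null ? block : m.group("plain"));
        }
        return pairs;
    }

    public static void main(String[] args) {
        parse("a=start(1\n2)end\nb=3").forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Named groups require Java 7 or later.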

String input = "key1=1\r\n" +
               "key2=start(a\r\n" +
               "b\r\n" +
               "c=\r\n" +
               "d)end\r\n" +
               "key3=d=e=f\r\n" +
               "somekey=start(123)end\r\n" +
               "morekey=start(1\r\n" +
               "2)end\r\n" +
               "key=jj\r\n";

String regex = "(?ms)^(\\w+)=(?:start\\((.*?)\\)end|(.*?))$";

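// LinkedHashMap preserves insertion order, so the pairs print in input order;
// a plain HashMap gives no ordering guarantee.
Map<String, String> map = new LinkedHashMap<>();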
for (Matcher m = Pattern.compile(regex).matcher(input); m.find(); )
    map.put(m.group(1), (m.start(2) != -1 ? m.group(2) : m.group(3)));

for (Entry<String, String> e : map.entrySet())
    System.out.printf("%-7s -> %s%n", e.getKey(),
                      e.getValue().replaceAll("(\\R)", "$1           "));

Output

key1    -> 1
key2    -> a
           b
           c=
           d
key3    -> d=e=f
somekey -> 123
morekey -> 1
           2
key     -> jj