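Java regex pattern does not match

Time: 2017-05-30 19:53:04

Tags: java regex

I have a .txt file with a "Home" marker near the end, and I want to take all the text after the Home marker up to the end of the file. In a few cases, however, the text I want is followed by several empty lines (more than 3) and then some text I don't need. So I need a regular expression that takes all the text after the Home marker but stops if there are 3 or more empty lines. This is the .txt file that causes the problem: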

Home
"Empty LINE"
    some text that I need some text that I need some text that I need some text that I need some text that I need some text that I need some text that I need some text that I needsome text that I need

"Empty LINE"

"Empty LINE"

"Empty LINE"

"Empty LINE"

"Empty LINE"






some info that I don't need
some info that I don't need
some info that I don't need
some info that I don't need

This is my code:

String content = new String(Files.readAllBytes(Paths.get(FILENAME)));

System.out.println(content);
String pattern = "Home\\s(.*$)";

// Create a Pattern object
Pattern r = Pattern.compile(pattern);

// Now create matcher object.
Matcher m = r.matcher(content);
if (m.find()) {
    System.out.println("Found value: " + m.group(1));
} else {
    System.out.println("NO MATCH");
}

2 answers:

Answer 0 (score: 0)

To capture all the text until three empty lines or the end of the file is reached, try:

Home\\s(.*?)(?=\\n{3}|$)
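  • Home\s matches the literal Home followed by a whitespace character \s
  • The capturing group (.*?) then matches any characters (non-greedy)
  • The positive lookahead (?=\\n{3}|$) checks whether what follows is 3 consecutive newlines \n{3} or the end of the file $

In addition, you need to use the DOTALL flag so that the dot . also matches line separators.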

Pattern.compile(regex, Pattern.DOTALL)
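
To make the effect of the flag concrete, here is a minimal sketch that runs three variants against the same in-memory string. The class name DotallDemo, the helper capture, and the sample text are made up for illustration and are not part of the original post; the sample is also assumed to use Unix-style \n line endings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: class, method and sample text are not from the original post.
class DotallDemo {

    // Returns group 1 of the first match, or null if nothing matches.
    static String capture(String regex, int flags, String text) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumption: Unix-style '\n' line endings.
        String content = "intro\nHome\nwanted 1\nwanted 2\n\n\n\nunwanted\n";

        // 1. Pattern from the question, default flags: '.' stops at the first '\n',
        //    so '$' is never reached and nothing matches (prints null).
        System.out.println(capture("Home\\s(.*$)", 0, content));

        // 2. Same pattern with DOTALL: it matches, but the greedy '.*' runs to the
        //    end of the input, so the unwanted text is captured too (prints true).
        System.out.println(
                capture("Home\\s(.*$)", Pattern.DOTALL, content).contains("unwanted"));

        // 3. Pattern from this answer with DOTALL: the lazy '.*?' stops right
        //    before the run of three '\n' characters.
        System.out.println(capture("Home\\s(.*?)(?=\\n{3}|$)", Pattern.DOTALL, content));
    }
}
```

The third call prints only the two "wanted" lines. Passing 0 as the flags argument simply means "no flags", which reproduces the behaviour of the one-argument Pattern.compile used in the question.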

Regex101 Demo

Here is a working Java demo on ideone

Answer 1 (score: 0)

The following regex will do it:

"(?s)(?:^|\\R)Home\\R(.*?)(?:\\R{3}|$)"

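Explanation:

  • (?s) - Lets the . used later in the pattern match line terminators (the DOTALL flag).

  • (?:^|\\R) - Matches the beginning of the text or a line terminator. Note the use of the \R linebreak matcher so that Windows line terminators are matched correctly.

  • Home\\R - Matches the literal Home and a line terminator.

  • (.*?) - Matches and captures the wanted text, ending as soon as the pattern that follows identifies the end of the wanted text (reluctant quantifier).

  • (?:\\R{3}|$) - Matches 3 line terminators or the end of the text.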
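
The point about Windows line terminators can be checked with a small sketch. The class name LinebreakMatcherDemo, the helper firstGroup, and the sample string are hypothetical and chosen only for illustration. The sample uses \r\n line endings, and the \n-based pattern is included purely for contrast.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: names and sample text are not from the original post.
class LinebreakMatcherDemo {

    // Returns group 1 of the first match, or null if the pattern does not match.
    static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Wanted text, a run of empty lines, then unwanted text, all with CRLF endings.
        String crlf = "Home\r\nwanted\r\n\r\n\r\n\r\nunwanted";

        // \R treats each "\r\n" pair as one line terminator, so \R{3} finds the gap.
        String withR = firstGroup("(?s)(?:^|\\R)Home\\R(.*?)(?:\\R{3}|$)", crlf);
        System.out.println(withR); // prints: wanted

        // \n{3} needs three adjacent '\n' characters; with CRLF they are separated
        // by '\r', so only '$' can end the match and the unwanted text is captured.
        String withN = firstGroup("(?s)Home\\s(.*?)(?=\\n{3}|$)", crlf);
        System.out.println(withN.contains("unwanted")); // prints: true
    }
}
```

On input that only contains \n line endings, both patterns stop at the same place, so the difference only shows up for files written with \r\n.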

Test

Path path = Paths.get("path/to/file.txt");
String text = new String(Files.readAllBytes(path)); // assume default character encoding

Matcher m = Pattern.compile("(?s)(?:^|\\R)Home\\R(.*?)(?:\\R{3}|$)").matcher(text);
if (m.find())
    System.out.printf("'%s'", m.group(1));
else
    System.out.println("** NOT FOUND **");

The text file is a copy/paste of the text from the question.

Output

'"Empty LINE"
    some text that I need some text that I need some text that I need some text that I need some text that I need some text that I need some text that I need some text that I needsome text that I need

"Empty LINE"

"Empty LINE"

"Empty LINE"

"Empty LINE"

"Empty LINE"'