Java正则表达式以匹配多行

时间:2019-05-31 04:38:59

标签: java regex pattern-matching

以下是应在其上应用正则表达式的数据的示例:

2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Filter -> Map (1/1) (824780055001546646d35df7a64cfe3c) switched from CANCELING to CANCELED.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Try to restart or fail the job  (3064130e1dccead0b037f193d3699c3b) if no longer possible.
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Could not restart the job  (3064130e1dccead0b037f193d3699c3b) because the restart strategy prevented it.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 3064130e1dccead0b037f193d3699c3b.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - Shutting down
2019-05-27 10:49:18,419 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Job 3064130e1dccead0b037f193d3699c3b reached globally terminal state FAILED.

基本上我要提取的是时间戳和错误消息:

对于实例:

TimeStamp               Error
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)

这里错误消息被分成多行,为此,我编写了如下的Java模式:

((?m)\\d{4}-[01]\\d-[0-3]\\d\\s[0-2]\\d((:[0-5]\\d)?){2}[\\s\\S]*ERROR[\\s\\S]*[ ]*at [\\s\\S]*)

但是它将返回文件的所有内容。

我应该怎么做才能使其工作,以便它也能给出多行错误消息。

2 个答案:

答案 0 :(得分:1)

尝试

((\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5})\sERROR.+?(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}))

外植:

  • (\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5})-匹配时间戳记
  • \sERROR.+?(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5})-进行非贪婪匹配,直到找到下一个时间戳(正向超前)
  • 我还要强调一点,在使用此正则表达式时,您必须使用m选项进行多行匹配
  • 此匹配项将为您提供[[log, timestamp],[log, timestamp]]之类的每个匹配项的嵌套组

答案 1 :(得分:0)

您的模式看起来不对,并且您还应该在点全部模式下使用模式,因为要捕获的堆栈跟踪部分可能跨越一行而不是一行。我建议使用以下正则表达式模式:





    /**** fragment ****/
    val groups: MutableList<Item> = ArrayList<Item>()

    val childs : MutableList<ItemDetail> = ArrayList<ItemDetail>()
    childs.add(ItemDetail("200x200",4))
    childs.add(ItemDetail("300x300",6))
    childs.add(ItemDetail("400x300",2))
    groups.add(Item("Item ISFAKHAN",childs))

    val childs2 : MutableList<ItemDetail> = ArrayList<ItemDetail>()
    childs2.add(ItemDetail("200x200",42))
    childs2.add(ItemDetail("300x300",62))
    childs2.add(ItemDetail("400x300",22))
    groups.add(Item("Item DASTAN",childs2))
    /***  error ***/
    cAdapter = CartAdapter(groups)
    /*** Type mismatch.Required: MutableList<out ExpandableGroup<Parcelable>>? Found: MutableList<Item> ***/



      /****** adapter *****/
    class CartAdapter(groups: MutableList<out ExpandableGroup<Parcelable>>?) : 
    ExpandableRecyclerViewAdapter<ItemVH, ItemDetailVH>(groups) {

}

这与时间戳匹配,后跟\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ERROR.*?(?=\bat\b) ,然后匹配所有内容,直到到达第一个ERROR

这是一个有效的测试脚本:

at

输出:

String input = "2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.\njava.lang.IllegalArgumentException: json can not be null or empty\n    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)";
String pattern = "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3} ERROR.*?(?=\\bat\\b)";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(input);
if (m.find()) {
    System.out.println(m.group(0));
}