Regex - matching entries in a Tomcat log file

Time: 2018-04-02 08:27:00

Tags: java regex

I am trying to read the log entries one by one. Here is part of the log file:

at java.net.Socket.<init>(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:211)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
    ... 31 more
26.03.2018 14:43:57,113| INFO http-nio-8080-exec-10 configService==nullLooking up configuration service on rmi://localhost:1199/ConfigService |com.ase.common.utils.ConfigurationServiceUtils
26.03.2018 14:43:57,113| WARN http-nio-8080-exec-10 Could not connect to services. |com.ase.common.utils.ConfigurationServiceUtils
java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: 
    java.net.ConnectException: Connection refused
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
    at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:342)
    at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source) 

This is the log pattern:

  

ConversionPattern=%d{dd.MM.yyyy HH:mm:ss,SSS}| %p %t %m |%c%n

I need to extract all the details from each entry: the date, priority, thread, message, and class. This is what I have so far:

(.*?)\| [A-Z]+ (.*?) (.*?) \|(.*)[\S\s]

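It matches every entry, but without the stack trace. How can I improve my regex so that it captures the stack trace as well?

So this is what I need: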

Match1 : `26.03.2018 14:43:57,113| INFO http-nio-8080-exec-10 configService==nullLooking up configuration service on rmi://localhost:1199/ConfigService |com.ase.common.utils.ConfigurationServiceUtils` 

Group(1)-> `26.03.2018 14:43:57,113`; Group(2)->`INFO`; Group(3)-> `http-nio-8080-exec-10`; Group(4)->`configService==nullLooking up configuration service on rmi://localhost:1199/ConfigService`; 
Group(5)->`com.ase.common.utils.ConfigurationServiceUtils`

Match2 : `26.03.2018 14:43:57,113| WARN http-nio-8080-exec-10 Could not connect to services. |com.ase.common.utils.ConfigurationServiceUtils
java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: 
    java.net.ConnectException: Connection refused
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
    at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:342)
    at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source) ` 

Group(1)-> `26.03.2018 14:43:57,113`; Group(2)->`WARN`; Group(3)-> `http-nio-8080-exec-10`; Group(4)->`Could not connect to services.`; 
Group(5)->`com.ase.common.utils.ConfigurationServiceUtils`; Group(6)->

    java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: 
        java.net.ConnectException: Connection refused...

1 answer:

Answer 0 (score: 1)

You should use more specific capturing groups in your regex. I'll walk through them below:

^(\d+\.\d+\.\d{4}[^|]+)\|\s+(\S+)\s+(\S+)\s+([^|]+)\|(\S+)\s+((?:(?!^\d+\.)[^|])*)

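Breakdown:

  • ^(\d+\.\d+\.\d{4}[^|]+)\| matches a line that starts with a date and captures everything up to the first | (capturing group #1). The ^ must match at every line start, so the multiline flag has to be on (Pattern.MULTILINE in Java).
  • \s+(\S+) matches whitespace and captures the following non-whitespace characters, i.e. the priority (CG #2)
  • \s+(\S+) same again, for the thread name (CG #3)
  • \s+([^|]+)\| matches whitespace, captures anything other than | as the message (CG #4), then matches the closing |
  • (\S+)\s+ matches and captures non-whitespace characters, i.e. the class, followed by whitespace (CG #5)
  • ((?:(?!^\d+\.)[^|])*) is a tempered pattern: before consuming each character it checks that the current position is not the start of a line that begins the next entry. If it is not, it consumes the next character, as long as that character is not | (CG #6, empty when the entry has no stack trace)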
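
To show how the pattern might be used from Java, here is a minimal sketch. The class name `LogEntryParser`, the nested `Entry` type, and the `parse` method are hypothetical names chosen for illustration; only the regex itself comes from the answer. It assumes the whole log has been read into one string and that the last entry ends with a newline, since `(\S+)\s+` needs at least one whitespace character after the class name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the original answer.
class LogEntryParser {

    // The pattern from the answer, unchanged. MULTILINE makes ^ match at each
    // line start, which the leading anchor and the (?!^\d+\.) lookahead rely on.
    static final Pattern ENTRY = Pattern.compile(
        "^(\\d+\\.\\d+\\.\\d{4}[^|]+)\\|\\s+(\\S+)\\s+(\\S+)\\s+([^|]+)\\|(\\S+)\\s+"
            + "((?:(?!^\\d+\\.)[^|])*)",
        Pattern.MULTILINE);

    /** One parsed log entry; stackTrace is empty when the entry has none. */
    static final class Entry {
        final String date, level, thread, message, logger, stackTrace;

        Entry(Matcher m) {
            date = m.group(1).trim();
            level = m.group(2);
            thread = m.group(3);
            message = m.group(4).trim();    // drop the space before the closing |
            logger = m.group(5);
            stackTrace = m.group(6).trim(); // drop the trailing newline
        }
    }

    /** Finds every entry in the given log text, in order of appearance. */
    static List<Entry> parse(String log) {
        List<Entry> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(log);
        while (m.find()) {
            entries.add(new Entry(m));
        }
        return entries;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
            "26.03.2018 14:43:57,113| WARN http-nio-8080-exec-10 Could not connect to services. |com.ase.common.utils.ConfigurationServiceUtils",
            "java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: ",
            "    java.net.ConnectException: Connection refused",
            "    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)") + "\n";

        for (Entry e : parse(log)) {
            System.out.println(e.date + " " + e.level + " [" + e.thread + "] " + e.message);
            if (!e.stackTrace.isEmpty()) {
                System.out.println(e.stackTrace);
            }
        }
    }
}
```

Because the last group uses `[^|]`, a stack trace that itself contains a `|` character would be cut off at that point, so this sketch only suits logs where `|` appears solely as the field separator.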
