Regex for (optional) multi-line Java stack traces

Time: 2013-12-16 10:55:08

Tags: python regex

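This is my first question on Stack Overflow. I have found plenty of answers on Stack Overflow before, but nothing for the problem I am running into now. (P.S. Thanks for all the great answers so far!)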

I am building a log file parser that ships everything to Graylog, but I am having trouble matching multi-line stack traces (or multi-line INFO messages).

Right now I have this Python code:

import re

# Entry layout: [dd/mm/yyyy hh:mm:ss] MW<id> P<id> PR<id>] LEVEL - logger - message
pattern = re.compile(r'^\[(\d{2})/(\d{2})/(\d{4}) (\d{2}):(\d{2}):(\d{2})\] MW(\d*) P(\d*) PR(\d*)\] (.{5}) - (.*) - (.*)', re.MULTILINE)
colnames = ('day','month','year','hour','minute','second','medewerkerid','patientid','praktijkid','level','logger','short_message')

# Match the pattern against each line separately and print every captured group.
with open('file.log.2013-12-03', 'r') as log_file:
    for line in log_file:
        match = pattern.match(line)
        if match:
            for item in match.groups():
                print(item)
            print()

The log file looks like this:

[03/12/2013 00:20:09] MW310148720 P316855786 PR306788004] WARN  - o.h.hql.ast.QueryTranslatorImpl - firstResult/maxResults specified with collection fetch; applying in memory!
[03/12/2013 00:20:09] MW310148720 P316855786 PR306788004] WARN  - o.h.hql.ast.QueryTranslatorImpl - firstResult/maxResults specified with collection fetch; applying in memory!
[03/12/2013 00:20:09] MW310148720 P316855786 PR306788004] INFO  - n.p.a.w.c.agenda.AgendaKalenderCtrl - AgendaKalenderCtrl.perform(...) duurde 52 ms
[03/12/2013 00:20:22] MW310148720 P316855786 PR306788004] WARN  - o.h.hql.ast.QueryTranslatorImpl - firstResult/maxResults specified with collection fetch; applying in memory!
[03/12/2013 00:20:22] MW310148720 P316855786 PR306788004] WARN  - o.h.hql.ast.QueryTranslatorImpl - firstResult/maxResults specified with collection fetch; applying in memory!
[03/12/2013 00:20:22] MW310148720 P316855786 PR306788004] INFO  - n.p.a.w.c.agenda.AgendaKalenderCtrl - AgendaKalenderCtrl.perform(...) duurde 47 ms
[03/12/2013 00:22:18] MW310148720 P316855786 PR306788004] INFO  - nl.xxxxxxxx.authentication - Subject Subject:
    Principal: ActionPrincipal: agenda
    Principal: UserPrincipal: Ween
    Principal: ActionPrincipal: medicatievoorschrijven
    Principal: ActionPrincipal: versturenbbberichten
    Principal: ActionPrincipal: standaardvoorschriftvastleggen
    Principal: ActionPrincipal: facturatie
    Principal: ActionPrincipal: medischdossier
    Principal: ApplicationPrincipal: his
    Principal: ActionPrincipal: altijdherhalen
    Principal: ActionPrincipal: onderhoudpatienten
    Principal: ActionPrincipal: medicatieauthoriseren
    Principal: ActionPrincipal: zoekenpassant
    Principal: ActionPrincipal: rapportage
    Principal: WEB_BROWSER_CHROME
 is afgemeld
[03/12/2013 00:22:18] MW310148720 P316855786 PR306788004] INFO  - nl.xxxxxxxx.application - [LOGOFF]           User 'Ween' logged off.
[03/12/2013 06:40:59] MW310155226 P PR301914008] WARN  - n.p.a.b.a.jndi.JndiResourcesHelper - Getting 'threadPoolTimeout' from JNDI context failed. Using default value: 900
[03/12/2013 06:41:10] MW310155226 P PR301914008] WARN  - o.h.hql.ast.QueryTranslatorImpl - firstResult/maxResults specified with collection fetch; applying in memory!
[03/12/2013 06:41:10] MW310155226 P PR301914008] WARN  - o.h.hql.ast.QueryTranslatorImpl - firstResult/maxResults specified with collection fetch; applying in memory!
[03/12/2013 06:41:10] MW310155226 P PR301914008] INFO  - n.p.a.w.c.agenda.AgendaKalenderCtrl - AgendaKalenderCtrl.perform(...) duurde 33 ms

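All single-line log entries work fine. Everything is split into the right number of groups, except for the multi-line entries. I would like the multi-line message to become part of one additional group (which will be named full_message).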

I have tried a lot of things, but my regex knowledge is not very good.

Can anyone suggest something?

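Basically, what I want is this: if the line after an entry does not start with a bracket ([), then that line and all the following lines, up to the next line that starts with a bracket ([), should be captured together in one group.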
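The grouping rule described above can be sketched without any regex, as a pre-processing step that folds continuation lines into the entry they belong to. This is only a minimal sketch: the helper name group_entries is hypothetical, and it assumes that only the first line of an entry starts with a bracket, which a stack-trace line beginning with "[" would break.

```python
def group_entries(lines):
    """Yield one string per log entry, with continuation lines folded in."""
    current = []
    for line in lines:
        # Assumption: only the first line of an entry starts with '['.
        if line.startswith('[') and current:
            yield ''.join(current)
            current = []
        current.append(line)
    if current:
        yield ''.join(current)


# Shortened excerpt of the log shown in the question.
sample_lines = [
    "[03/12/2013 00:22:18] MW310148720 P316855786 PR306788004] INFO  - nl.xxxxxxxx.authentication - Subject Subject:\n",
    "    Principal: ActionPrincipal: agenda\n",
    "    Principal: WEB_BROWSER_CHROME\n",
    " is afgemeld\n",
    "[03/12/2013 00:22:18] MW310148720 P316855786 PR306788004] INFO  - nl.xxxxxxxx.application - [LOGOFF]           User 'Ween' logged off.\n",
]

entries = list(group_entries(sample_lines))
print(len(entries))
```

Each yielded string starts with the bracketed header line, so an existing per-line pattern can still be applied to the first line of every entry, with the remainder treated as the extra message text.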

If a line does not match, I could handle it some other way, but I would like to know whether this can also be done with a single regex.
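A single pattern does seem feasible if the file is read as one string instead of line by line. The sketch below keeps the pattern from the question unchanged and appends one extra group that consumes every following line not starting with "[", using a negative lookahead. The names ENTRY and parse_log are hypothetical, and storing only the continuation lines under full_message is an assumption about what that field should hold.

```python
import re

# The pattern from the question plus one extra group for continuation lines.
# Assumption: a new entry always starts with '[' at the beginning of a line.
ENTRY = re.compile(
    r'^\[(\d{2})/(\d{2})/(\d{4}) (\d{2}):(\d{2}):(\d{2})\] '
    r'MW(\d*) P(\d*) PR(\d*)\] (.{5}) - (.*) - (.*)'
    r'((?:\n(?!\[).*)*)',  # zero or more following lines not starting with '['
    re.MULTILINE,
)

COLNAMES = ('day', 'month', 'year', 'hour', 'minute', 'second',
            'medewerkerid', 'patientid', 'praktijkid', 'level', 'logger',
            'short_message', 'full_message')


def parse_log(text):
    """Return one dict per log entry found in text."""
    records = []
    for match in ENTRY.finditer(text):
        record = dict(zip(COLNAMES, match.groups()))
        # Drop the newlines that surround the continuation block.
        record['full_message'] = record['full_message'].strip('\n')
        records.append(record)
    return records


# Shortened excerpt of the log shown in the question.
sample = (
    "[03/12/2013 00:22:18] MW310148720 P316855786 PR306788004] INFO  - nl.xxxxxxxx.authentication - Subject Subject:\n"
    "    Principal: ActionPrincipal: agenda\n"
    "    Principal: WEB_BROWSER_CHROME\n"
    " is afgemeld\n"
    "[03/12/2013 00:22:18] MW310148720 P316855786 PR306788004] INFO  - nl.xxxxxxxx.application - [LOGOFF]           User 'Ween' logged off.\n"
    "[03/12/2013 06:40:59] MW310155226 P PR301914008] WARN  - n.p.a.b.a.jndi.JndiResourcesHelper - Getting 'threadPoolTimeout' from JNDI context failed. Using default value: 900\n"
)

for record in parse_log(sample):
    print(record['logger'], repr(record['full_message']))
```

For a real file, the text would come from reading the whole file at once (for example open('file.log.2013-12-03').read()) rather than iterating over lines, since pattern.match on a single line can never see the lines that follow it. Single-line entries simply get an empty full_message.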

0 answers:

No answers