Split a ~200 MB log4j log file by day

Time: 2018-01-03 19:58:27

Tags: bash text awk logfile

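I have a log file in the format shown below, and I want to split it into one file per day (i.e. log-2017-10-2, log-2017-10-3, and so on). I have seen people do this with awk, but I don't know how to handle stack traces, because a line such as java.io.IOException starts on a new line without a timestamp. Is there a convenient way to do this?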

    2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
    2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
    2017-10-04 04:26:02,544 INFO XXXXXXXXX
    2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
    2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
    2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
    java.io.IOException: Connection to X was disconnected before the response was read
            at XXXXXXXXXXXXXXXXXXXX
            at XXXXXXXXXXXXXXXXXXXX
            at XXXXXXXXXXXXXXXXXXXXX
            at XXXXXXXXXXXXXXXX
            at XXXXXXXXXXXXXXXX
    2017-10-05 04:26:02,549 INFO XXXXXXXXXXX

The resulting files should contain:

log-2017-10-2:
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX


log-2017-10-3:
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX

log-2017-10-4:
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
        at XXXXXXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXX

log-2017-10-5:
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX

2 answers:

Answer 0 (score: 4)

awk to the rescue!

$ awk --posix 'BEGIN{f="log-header"} 
     $1~/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/{f="log-"$1} {print > f}' log

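If the log covers too many distinct dates (and therefore too many open files), you may need to close files at some point. With a few hundred dates it should work as is.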

The initial output file (log-header) is set in case your log does not start with a line that matches the regular expression.
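
The point about too many open files can be handled by closing the current output file whenever the date changes. The following is a minimal sketch under stated assumptions, not the answerer's code: sample.log and its contents are made up for illustration, the date regex is spelled out with repeated [0-9] so it does not depend on interval-expression support, and >> is used instead of > so that reopening a file for a date that shows up again appends rather than truncates.

```shell
# Work in a scratch directory so no existing files are touched.
cd "$(mktemp -d)"

# Hypothetical sample input; replace sample.log with your own file.
cat > sample.log <<'EOF'
2017-10-02 04:26:02,534 INFO first
2017-10-04 04:53:02,787 WARN Error in fetch
java.io.IOException: Connection to X was disconnected
        at some.Frame
2017-10-05 04:26:02,549 INFO last
EOF

awk 'BEGIN { f = "log-header" }
     # A line whose first field looks like YYYY-MM-DD starts a new record.
     $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
         new = "log-" $1
         if (new != f) { close(f); f = new }   # keep only one file open
     }
     # Lines without a date (stack traces) go to the current file.
     { print >> f }' sample.log
```

Because of >>, running this twice in the same directory appends to the files from the first run, so remove old log-* files before re-running.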

Answer 1 (score: 2)

awk solution:

awk '/^[0-9]{4}-[0-9]{2}-[0-9]{2} /{ 
         if (fn && !a[$1]++) close(fn);
         fn="log-"$1 
     }{ print > fn }' logfile
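  • /^[0-9]{4}-[0-9]{2}-[0-9]{2} / - matches a line that starts with a date string
  • if (fn && !a[$1]++) close(fn) - when a date is seen for the first time, closes the file that was open for the previous date
  • fn="log-"$1 - builds the output file name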
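
To see why stack traces end up in the right file, it may help to do a dry run that prints the target file name next to each line instead of writing anything. This is only an illustrative sketch: the four input lines are invented, and the regex is written without intervals for portability. Lines that do not start with a date leave fn unchanged, so they follow the most recent dated line.

```shell
# Dry run: show which file each line would be routed to.
routed=$(printf '%s\n' \
    '2017-10-04 04:53:02,787 WARN Error in fetch' \
    'java.io.IOException: Connection to X was disconnected' \
    '        at some.Frame' \
    '2017-10-05 04:26:02,549 INFO ok' |
  awk '/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { fn = "log-" $1 }
       { print fn ": " $0 }')

printf '%s\n' "$routed"
```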

Viewing the results:

$ head log-*
==> log-2017-10-02 <==
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX

==> log-2017-10-03 <==
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX

==> log-2017-10-04 <==
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
        at XXXXXXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXX
        at XXXXXXXXXXXXXXXX

==> log-2017-10-05 <==
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX
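
Beyond eyeballing the head output, a quick sanity check is to confirm that no lines were lost and that every output file starts with a line carrying its own date. The sketch below assumes a small made-up logfile and re-creates the split in a scratch directory. The file names and sample contents are hypothetical, and the split command is a portable variant of the idea above, not the answerer's exact code.

```shell
cd "$(mktemp -d)"

# Hypothetical sample input.
cat > logfile <<'EOF'
2017-10-03 04:26:02,543 INFO one
2017-10-04 04:53:02,787 WARN Error in fetch
java.io.IOException: Connection to X was disconnected
        at some.Frame
2017-10-05 04:26:02,549 INFO two
EOF

# Split by day; lines before the first dated line (none here) would be skipped.
awk '/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
         if (fn) close(fn)
         fn = "log-" $1
     }
     fn { print >> fn }' logfile

# Check 1: the per-day files together hold as many lines as the input.
total=$(wc -l < logfile)
split_total=$(cat log-* | wc -l)
if [ "$total" -eq "$split_total" ]; then
    echo "OK: line counts match"
else
    echo "MISMATCH: input and output line counts differ"
fi

# Check 2: each file begins with a line that carries its own date.
for f in log-*; do
    first=$(head -n 1 "$f")
    case "$first" in
        "${f#log-} "*) ;;                              # expected
        *) echo "unexpected first line in $f" ;;
    esac
done
```

The glob log-* does not match logfile itself, since that name has no hyphen. With a differently named input, make sure the pattern cannot pick it up.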