awk between two dates in a log file - almost working

Time: 2014-02-13 17:12:08

Tags: awk grep csh

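I have a csh script that tries to pick out the entries in a log file that fall between two dates.

(In the script these are $start_date and $end_date, entered as DD/MM/YYYY, but I have simplified things here.)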

more text_B_14_FEB_03.dt | grep TMYO 

TMYO140043J:=TMYO140043J     P33BJm SOLO            03/02/2014 
TMYO140044J:=TMYO140044J     P4m    FINL            03/02/2014 
TMYO140044M:=TMYO140044M     P3BJ   FINL            03/02/2014 
TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014 
TMYO140046M:=TMYO140046M     P33BJq RENN            04/02/2014 
TMYO140047M:=TMYO140047M     P33BJl AKHT            05/02/2014 
TMYO140048M:=TMYO140048M     P3l    MACL            05/02/2014 
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014 
TMYO140050M:=TMYO140050M     P3q    ROCH            06/02/2014 
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014 
TMYO140052L:=TMYO140052L     P3v    ROSE            07/02/2014 
TMYO140053L:=TMYO140053L     P3v    CAIR            07/02/2014 
TMYO140054L:=TMYO140054L     P3v    MURR            07/02/2014 

I tried the following, but it does not handle dates from the previous year correctly:

more text_B_14_FEB_03.dt | grep TMYO | awk '$5>="02/01/2013" && $5<="13/02/2014"'

TMYO140043J:=TMYO140043J     P33BJm SOLO            03/02/2014 
TMYO140044J:=TMYO140044J     P4m    FINL            03/02/2014 
TMYO140044M:=TMYO140044M     P3BJ   FINL            03/02/2014 
TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014 
TMYO140046M:=TMYO140046M     P33BJq RENN            04/02/2014 
TMYO140047M:=TMYO140047M     P33BJl AKHT            05/02/2014 
TMYO140048M:=TMYO140048M     P3l    MACL            05/02/2014 
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014 
TMYO140050M:=TMYO140050M     P3q    ROCH            06/02/2014 
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014 
TMYO140052L:=TMYO140052L     P3v    ROSE            07/02/2014 
TMYO140053L:=TMYO140053L     P3v    CAIR            07/02/2014 
TMYO140054L:=TMYO140054L     P3v    MURR            07/02/2014   

Here, with the start date changed to 04/01/2013, the 03/02/2014 entries are wrongly left out:

more text_B_14_FEB_03.dt | grep TMYO | awk '$5>="04/01/2013" && $5<="13/02/2014"'

TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014 
TMYO140046M:=TMYO140046M     P33BJq RENN            04/02/2014 
TMYO140047M:=TMYO140047M     P33BJl AKHT            05/02/2014 
TMYO140048M:=TMYO140048M     P3l    MACL            05/02/2014 
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014 
TMYO140050M:=TMYO140050M     P3q    ROCH            06/02/2014 
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014 
TMYO140052L:=TMYO140052L     P3v    ROSE            07/02/2014 
TMYO140053L:=TMYO140053L     P3v    CAIR            07/02/2014 
TMYO140054L:=TMYO140054L     P3v    MURR            07/02/2014 

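Any idea where the awk part goes wrong? I appreciate that perl may be the most flexible answer, but my perl scripting is not there yet, and I would like to solve this with awk first.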

2 answers:

Answer 0: (score: 4)

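You need to parse the dates into seconds and compare those. For that you have to use the mktime() function (a gawk extension, not part of POSIX awk), which takes a string containing the parts of the date, so you must split() the date first. This program is odd because it has a lot of repeated code, but it seems to work, and I hope you get the idea: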

awk '
    BEGIN { 
        # Lower bound: DD/MM/YYYY -> "YYYY MM DD HH MM SS" -> seconds
        date1 = "04/02/2014"
        split(date1, arr, "/")
        seconds1 = mktime(arr[3] " " arr[2] " " arr[1] " 0 0 0") 

        # Upper bound, converted the same way
        date2 = "06/02/2014"
        split(date2, arr, "/")
        seconds2 = mktime(arr[3] " " arr[2] " " arr[1] " 0 0 0")
    }

    {
        # The date is the last field of each log line
        split($NF, arr, "/")
        s = mktime(arr[3] " " arr[2] " " arr[1] " 0 0 0")

        if (s >= seconds1 && s <= seconds2) {
            print $0
        }
    }
' infile

With your second set of sample data, it yields:

TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014 
TMYO140046M:=TMYO140046M     P33BJq RENN            04/02/2014 
TMYO140047M:=TMYO140047M     P33BJl AKHT            05/02/2014 
TMYO140048M:=TMYO140048M     P3l    MACL            05/02/2014 
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014 
TMYO140050M:=TMYO140050M     P3q    ROCH            06/02/2014 
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014 
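
The repeated split()/mktime() code could probably be folded into a small helper function, with the two bounds passed in through -v so that a calling script can supply its own $start_date and $end_date. The following is only a sketch under a few assumptions: to_epoch, first, last, lo and hi are hypothetical names that do not come from the question, the awk in use must provide mktime() (gawk does), and four lines of the sample data are inlined so the example runs on its own.

```shell
# Four lines taken from the sample data in the question.
sample='TMYO140044M:=TMYO140044M     P3BJ   FINL            03/02/2014
TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014
TMYO140051M:=TMYO140051M     P3q    FORR            06/02/2014
TMYO140052L:=TMYO140052L     P3v    ROSE            07/02/2014'

result=$(printf '%s\n' "$sample" | awk -v first="04/02/2014" -v last="06/02/2014" '
    # to_epoch is a hypothetical helper: DD/MM/YYYY -> seconds since the epoch.
    # mktime() is a gawk extension and is not part of POSIX awk.
    function to_epoch(dmy,    parts) {
        split(dmy, parts, "/")
        return mktime(parts[3] " " parts[2] " " parts[1] " 0 0 0")
    }
    BEGIN { lo = to_epoch(first); hi = to_epoch(last) }
    # Convert the date in the last field once per line.
    { s = to_epoch($NF) }
    # A pattern with no action prints the matching line.
    s >= lo && s <= hi
')
printf '%s\n' "$result"
```

In a csh script the two -v values would presumably be written as first="$start_date" and last="$end_date".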

Answer 1: (score: 2)

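You should convert the dates to YYYYMMDD format so that they sort lexicographically. As DD/MM/YYYY strings they are compared character by character starting with the day, which is why "03/02/2014" sorts before "04/01/2013". You can do the conversion with a gawk regex, or with substring operations in plain awk. Here is the gawk way: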

more text_B_14_FEB_03.dt | grep TMYO | gawk 'match($5, "([0-9]+)/([0-9]+)/([0-9]+)", ary) { B = ary[3] ary[2] ary[1]; if (B >= 20130104 && B <= 20140213) print }'
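
The substring route mentioned above might look like the following in plain awk, with no gawk-only features. Treat it as a sketch with assumptions: ymd, first, last, lo and hi are hypothetical names, the dates are assumed to be zero-padded to exactly DD/MM/YYYY, and the /TMYO/ pattern stands in for the separate grep. Four lines of the sample data are inlined so the example runs on its own.

```shell
# Four lines taken from the sample data in the question.
sample='TMYO140043J:=TMYO140043J     P33BJm SOLO            03/02/2014
TMYO140045M:=TMYO140045M     P33BJq MARS            04/02/2014
TMYO140049M:=TMYO140049M     P3q    HAYE            06/02/2014
TMYO140054L:=TMYO140054L     P3v    MURR            07/02/2014'

result=$(printf '%s\n' "$sample" | awk -v first="04/01/2013" -v last="13/02/2014" '
    # ymd is a hypothetical helper: DD/MM/YYYY -> YYYYMMDD.
    # It relies on zero-padded, fixed-width day and month.
    function ymd(dmy) {
        return substr(dmy, 7, 4) substr(dmy, 4, 2) substr(dmy, 1, 2)
    }
    BEGIN { lo = ymd(first); hi = ymd(last) }
    # Fixed-width YYYYMMDD strings compare in chronological order.
    /TMYO/ { key = ymd($NF); if (key >= lo && key <= hi) print }
')
printf '%s\n' "$result"
```

With the 04/01/2013 start date that went wrong in the question, the 03/02/2014 entry is kept, because the year is now compared first.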