Difference between date-times in a file

Time: 2014-11-03 12:27:04

Tags: shell awk

I have a huge log file:

202.32.92.47,01/Jun/1995:00:00:59,/~scottp/publish.html,200,271
ix-or7-27.ix.netcom.com,01/Jun/1995:00:02:51,/~ladd/ostriches.html,200,205908
...

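For each line, I need to calculate the time difference between the first line and the current line. The second column has the following format: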

dd/month/year:HH:MM:SS

I can change it in vim with this command:

:%s/\/Jun\//\:Jun\:/g

Then I get:

fromkin.lib.uwm.edu,01:Jun:1995:11:58:03,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,01:Jun:1995:11:58:03,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,01:Jun:1995:11:58:06,/~macphed/finite/fe_resources/node92.html,200,1668

with the format:

dd:month:year:HH:MM:SS

Is there a way to do this in a shell script or awk?

The output I expect is:

fromkin.lib.uwm.edu,01:Jun:1995:11:58:03,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,0,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,3,/~macphed/finite/fe_resources/node92.html,200,1668

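1 answer:

Answer 0 (score: 1)

It's not clear what your expected output should be, since the sample output you posted doesn't match the input you posted. Taking your posted sample input file, though, the following converts each timestamp to seconds and prints, for every line after the first, the number of seconds elapsed since the first line (using GNU awk for its time functions):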

$ cat file
202.32.92.47,01/Jun/1995:00:00:59,/~scottp/publish.html,200,271
ix-or7-27.ix.netcom.com,01/Jun/1995:00:02:51,/~ladd/ostriches.html,200,205908
fromkin.lib.uwm.edu,01/Jun/1995:11:58:03,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,01/Jun/1995:11:58:03,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,01/Jun/1995:11:58:06,/~macphed/finite/fe_resources/node92.html,200,1668

$ cat tst.awk
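BEGIN{ FS=OFS="," }
{
    # split the timestamp into t[1]=dd, t[2]=Mon, t[3]=yyyy, t[4]=HH, t[5]=MM, t[6]=SS
    split($2,t,/[\/:]/)
    # month name -> number: "Jun" starts at position 16, and (16+2)/3 = 6
    mthNr = (match("JanFebMarAprMayJunJulAugSepOctNovDec",t[2])+2)/3
    # GNU awk mktime() takes "YYYY MM DD HH MM SS" and returns epoch seconds (local time)
    currSecs = mktime(t[3]" "mthNr" "t[1]" "t[4]" "t[5]" "t[6])

    if (NR == 1) {
        baseSecs = currSecs       # first line: keep it unchanged as the reference
    }
    else {
        $2 = currSecs - baseSecs  # later lines: seconds elapsed since the first line
    }
    print
}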
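
The month lookup in tst.awk is the least obvious step. The three-letter abbreviations are packed into one string, so a name that starts at character position p (1, 4, 7, ...) is month number (p+2)/3. Here is a minimal sketch of just that step, which should run in any POSIX awk. The month_nr shell function is a hypothetical wrapper added for illustration and is not part of the script above:

```shell
# month_nr: print the 1-based month number for a three-letter English
# month abbreviation (hypothetical helper, for illustration only).
month_nr() {
    awk -v m="$1" 'BEGIN {
        # match() returns the position where m starts: Jan=1, Feb=4, ..., Dec=34.
        pos = match("JanFebMarAprMayJunJulAugSepOctNovDec", m)
        # Adding 2 and dividing by 3 maps 1, 4, 7, ... to 1, 2, 3, ...
        print (pos + 2) / 3
    }'
}

month_nr Jan   # prints 1
month_nr Jun   # prints 6
month_nr Dec   # prints 12
```

An abbreviation that does not occur in the string makes match() return 0 and yields a non-integer result, so input with unexpected month names would need an explicit check.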

$ awk -f tst.awk file
202.32.92.47,01/Jun/1995:00:00:59,/~scottp/publish.html,200,271
ix-or7-27.ix.netcom.com,112,/~ladd/ostriches.html,200,205908
fromkin.lib.uwm.edu,43024,/~scottp/publish.html,200,271
slip1.ac.brocku.ca,43024,/cgi-bin/hytelnet?file=DIR000,200,7748
bertram.hallf.lth.se,43027,/~macphed/finite/fe_resources/node92.html,200,1668
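
If GNU awk is not available, the same elapsed-seconds column can probably be produced without mktime(). The sketch below is an alternative under stated assumptions, not the method above. It converts each date to a Julian Day Number with the standard integer formula, treats the timestamps as plain wall-clock times (no time zone or DST handling), and uses index() instead of match() for the month lookup. The names elapsed_secs and jdn are made up for this example.

```shell
# elapsed_secs: read the log on stdin and, on every line after the first,
# replace field 2 with the seconds elapsed since the first line.
# Assumption: timestamps look like dd/Mon/yyyy:HH:MM:SS with English months.
elapsed_secs() {
    awk '
    BEGIN { FS = OFS = "," }

    # Julian Day Number of a Gregorian calendar date (hypothetical helper).
    function jdn(y, m, d,    a, yy, mm, n) {
        a  = int((14 - m) / 12)      # 1 for January and February, else 0
        yy = y + 4800 - a
        mm = m + 12 * a - 3          # March = 0 ... February = 11
        n  = d + int((153 * mm + 2) / 5) + 365 * yy
        n  = n + int(yy / 4) - int(yy / 100) + int(yy / 400) - 32045
        return n
    }

    {
        split($2, t, /[\/:]/)        # dd, Mon, yyyy, HH, MM, SS
        mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec", t[2]) + 2) / 3
        currSecs = jdn(t[3], mthNr, t[1]) * 86400 + t[4] * 3600 + t[5] * 60 + t[6]

        if (NR == 1) {
            baseSecs = currSecs      # first line is the reference
        }
        else {
            $2 = currSecs - baseSecs
        }
        print
    }
    '
}

elapsed_secs <<'EOF'
202.32.92.47,01/Jun/1995:00:00:59,/~scottp/publish.html,200,271
ix-or7-27.ix.netcom.com,01/Jun/1995:00:02:51,/~ladd/ostriches.html,200,205908
fromkin.lib.uwm.edu,01/Jun/1995:11:58:03,/~scottp/publish.html,200,271
EOF
```

On these three lines the second field becomes 112 and 43024 for the second and third lines, matching the GNU awk output above. Because only differences are printed, the choice of day-number origin does not matter.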