I have a log file that I am trying to reformat using sed / awk / grep, but I am having trouble with the date format. The log looks like this:
1.2.3.4 - - [28/Mar/2019:11:43:58 +0000] "GET /e9bb2dddd28b/5.6.7.8/YL0000000000.rom HTTP/1.1" "-" "Yealink W52P 25.81.0.10 00:00:00:00:00:00" 404 - 1 5 0.146
I want output like this:
Yealink,1.2.3.4,28-03-2019 11:43:58
I tried the following:
grep Yealink access.log | grep 404 | sed 's/\[//g' | awk '{print "Yealink,",$1,",",strftime("%Y-%m-%d %H:%M:%S", $4)}' | sed 's/, /,/g' | sed 's/ ,/,/g'
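Edit: based on the comments, I removed the [ before passing the date string to strftime, but it still does not work as expected.
It returns an empty (epoch) date, so clearly my strftime usage is wrong: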
Yealink,1.2.3.4,1970-01-01 01:00:00
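Answer 0 (score: 1)
See the gawk manual for details on strftime: it does not accept a time in any format other than seconds since the epoch. If gawk had a strptime() function this would be straightforward, but it does not (and I can't persuade the maintainers to provide one), so you have to convert the timestamp into a format that mktime() can turn into seconds and then pass THAT to strftime(), e.g.: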
$ awk '{
split($4,t,/[[\/:]/)
old = t[4] " " (index("JanFebMarAprMayJunJulAugSepOctNovDec",t[3])+2)/3 " " t[2] " " t[5] " " t[6] " " t[7];
secs = mktime(old)
new = strftime("%d-%m-%Y %T",secs);
print $4 ORS old ORS secs ORS new
}' file
[28/Mar/2019:11:43:58
2019 3 28 11 43 58
1553791438
28-03-2019 11:43:58
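Note that mktime() interprets its argument in the local time zone, so the seconds value shown above depends on where the command is run (1553791438 corresponds to a machine five hours behind UTC). Here is a minimal sketch, assuming an awk that provides mktime() and strftime() (such as GNU awk), that pins TZ to UTC so the result is reproducible. The function name epoch_utc is made up for illustration:

```shell
# epoch_utc is a hypothetical helper name; it reads log lines on stdin.
# Assumption: the awk in use provides mktime() and strftime() (e.g. GNU awk).
epoch_utc() {
    TZ=UTC awk '{
        split($4, t, /[[\/:]/)   # t[2]=day t[3]=month name t[4]=year t[5..7]=h, m, s
        month = (index("JanFebMarAprMayJunJulAugSepOctNovDec", t[3]) + 2) / 3
        # mktime() expects "YYYY MM DD HH MM SS" and reads it in the zone set by TZ
        secs = mktime(t[4] " " month " " t[2] " " t[5] " " t[6] " " t[7])
        print secs, strftime("%d-%m-%Y %H:%M:%S", secs)
    }'
}

printf '%s\n' '1.2.3.4 - - [28/Mar/2019:11:43:58 +0000] "GET /x HTTP/1.1"' | epoch_utc
```

With TZ=UTC this prints 1553773438 28-03-2019 11:43:58, which matches the +0000 offset in the log line.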
But of course you do not need mktime() or strftime() at all; just rearrange the parts of the date:
$ awk '{
split($4,t,/[[\/:]/)
new = sprintf("%02d-%02d-%04d %02d:%02d:%02d",t[2],(index("JanFebMarAprMayJunJulAugSepOctNovDec",t[3])+2)/3,t[4],t[5],t[6],t[7])
print $4 ORS new
}' file
[28/Mar/2019:11:43:58
28-03-2019 11:43:58
This will work in any awk, not just GNU awk, because it does not need any time functions.
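To get the exact line requested in the question, the same rearrangement can be combined with the Yealink and 404 filters in a single awk call. This is only a sketch: the function name yealink_report is made up for illustration, and it assumes the HTTP status is always the fifth field from the end of the line, as it is in the sample log entry.

```shell
# yealink_report is a hypothetical helper name; it reads log lines on stdin.
yealink_report() {
    awk -v OFS=, '
    # Assumption: the HTTP status is the fifth field from the end of the line.
    /Yealink/ && NF > 4 && $(NF-4) == 404 {
        split($4, t, /[[\/:]/)   # t[2]=day t[3]=month name t[4]=year t[5..7]=h, m, s
        month = (index("JanFebMarAprMayJunJulAugSepOctNovDec", t[3]) + 2) / 3
        stamp = sprintf("%02d-%02d-%04d %02d:%02d:%02d", t[2], month, t[4], t[5], t[6], t[7])
        print "Yealink", $1, stamp
    }'
}

printf '%s\n' '1.2.3.4 - - [28/Mar/2019:11:43:58 +0000] "GET /e9bb2dddd28b/5.6.7.8/YL0000000000.rom HTTP/1.1" "-" "Yealink W52P 25.81.0.10 00:00:00:00:00:00" 404 - 1 5 0.146' | yealink_report
```

For the sample line this prints Yealink,1.2.3.4,28-03-2019 11:43:58. Counting the status field from the end avoids depending on how many space-separated words the user-agent string contains.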
(index("JanFebMarAprMayJunJulAugSepOctNovDec",t[3])+2)/3 is just the idiomatic way to convert a 3-character month abbreviation (e.g. Mar) to the equivalent month number (3).
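The arithmetic behind that idiom: each month name occupies three characters in the lookup string, so index() returns 1, 4, 7, ... 34, and adding 2 then dividing by 3 maps those positions to 1 through 12. Here is a small sketch that isolates the conversion. The function name month_num is made up for illustration, and returning 0 for an unknown name is an added convention, not part of the answer above:

```shell
# month_num is a hypothetical helper name; pass a capitalised 3-letter month name.
month_num() {
    awk -v m="$1" 'BEGIN {
        pos = index("JanFebMarAprMayJunJulAugSepOctNovDec", m)
        # Names start at positions 1, 4, 7, ..., 34, so (pos + 2) / 3 gives 1..12.
        # index() returns 0 when the name is not found; report 0 in that case.
        print (pos ? (pos + 2) / 3 : 0)
    }'
}

month_num Mar   # prints 3
month_num Dec   # prints 12
```

The lookup only gives whole numbers for properly capitalised abbreviations; a string such as "arA" would match at position 8 and produce a fraction, so the input should be validated if it is not trusted.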
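Answer 1 (score: 0)
Another awk solution, with thanks to @EdMorton for reviewing the getline usage.
The idea here is to call the date command, which accepts abbreviated month names, from within awk: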
$ date -d"28/Mar/2019:11:43:58 +0000" "+%F %T" # Fails
date: invalid date ‘28/Mar/2019:11:43:58 +0000’
$ date -d"28 Mar 2019:11:43:58 +0000" "+%F %T" # Again fails because of : before time section
date: invalid date ‘28 Mar 2019:11:43:58 +0000’
$ date -d"28 Mar 2019 11:43:58 +0000" "+%F %T" # parses, but the output is converted to the local time zone (UTC+05:30 here), so the time is shifted
2019-03-28 17:13:58
$ date -d"28 Mar 2019 11:43:58" "+%F %T" # correct value after stripping +0000
2019-03-28 11:43:58
$
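The attempts above suggest the string edits needed before date will accept the timestamp: replace the slashes with spaces, replace the first colon with a space, and drop the zone offset. Here is a minimal sketch of just that massaging step in plain awk, so it can be checked without calling date. The function name to_date_arg is made up for illustration:

```shell
# to_date_arg is a hypothetical helper name; it reads timestamps on stdin.
to_date_arg() {
    awk '{
        gsub("/", " ")             # 28/Mar/2019 becomes 28 Mar 2019
        sub(":", " ")              # the first colon separates the date from the time
        sub(/ *[+-][0-9]+$/, "")   # drop the trailing zone offset such as +0000
        print
    }'
}

printf '%s\n' '28/Mar/2019:11:43:58 +0000' | to_date_arg   # prints 28 Mar 2019 11:43:58
```

Dropping the offset means date reads the value as local time, which only reproduces the logged time when the offset is not needed for a conversion.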
Result
awk -F"[][]" -v OFS=, '/Yealink/ {
split($1,a," "); #Format $1 to get IP
gsub("/", " ",$2); sub(":"," ",$2); sub("\\+[0-9]+","",$2); # Massage to get the date value
cmd = "date -d\047" $2 "\047 \047+%F %T\047"; if ( (cmd | getline line) > 0 ) $2=line; close(cmd) # use system date
print "Yealink",a[1],$2
} ' access.log
The file contents are shown below:
$ cat access.log
1.2.3.4 - - [28/Mar/2019:11:43:58 +0000] "GET /e9bb2dddd28b/5.6.7.8/YL0000000000.rom HTTP/1.1" "-" "Yealink W52P 25.81.0.10 00:00:00:00:00:00" 404 - 1 5 0.146
$