希望有人可以帮助我使用bash linux脚本从http日志生成报告。
日志格式:
domain.com 101.100.144.34 - r.c.bob [14/Feb/2017:11:31:20 +1100] "POST /webmail/json HTTP/1.1" 200 1883 "https://example.domain.com/webmail/index-rui.jsp?v=1479958955287" "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko" 1588 2566 "110.100.34.39" 9FC1CC8A6735D43EF75892667C08F9CE 84670 - - - -
输出要求:
time in epoch,host,Resp Code,count
1485129842,101.100.144.34,200,4000
1485129842,101.101.144.34,404,1889
我到目前为止所做的事情,但没有接近我想要实现的目标:
tail -100 httpd_access_*.log | awk '{print $5 " " $2 " " $10}' | sort | uniq
答案 0 :(得分:0)
awk 'BEGIN{
# print header
print "time in epoch,host,Resp Code,count"
# prepare month conversion array
split( "Jan Feb Mar Apr May Jun Jui Aug Sep Oct Nov Dec", tmp)
for (i in tmp) M[tmp[i]]=i
}
{
#prepare time conversion for mktime() using array and substitution
# from 14/Feb/2017:11:31:20 +1100
# to YYYY MM DD HH MM SS [DST]
split( $5, aT, /[:/[:blank:]]/)
t = $5; sub( /^.*:|:/, " ", t)
t = aT[3] " " M[aT[2]] " " aT[1] t
# count (not clear if it s this to count due to time changing
Count[ sprintf( "%s, %s, %s", mktime( t), $2, $10)]++
}
END{
# disply the result counted
for( e in Count) printf( "%s, %d\n", e, Count[e])
}
' httpd_access_*.log
答案 1 :(得分:0)
确保上面基于纯AWK的解决方案会更快,更完整。 但也可以用更小的步骤完成:
首先获取日期并将其转换为EPOCH:
$ dt=$(awk '{print $5,$6}' file.log)
$ ep=$(date -d "$(sed -e 's,/,-,g' -e 's,:, ,' <<<"${dt:1:-1}")" +"%s")
$ echo "$ep"
1487032280
由于现在你在bash var $ ep中有了纪元日期,你可以继续这样的初始awk:
$ awk -v edt=$ep '{print edt","$2","$10}' file.log
1487032280,101.100.144.34,200
如果你想要一个标题,你可以在最后一个awk之前用一个简单的回声打印一个。