I have Tomcat access logs with entries like the following:
50.47.142.25 - - [07/May/2012:00:00:14 +0000] 0 "GET /mywebpage/blah.jsp " 200 123 "-" "-"
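I want to load all the entries into a SQL table and then run SQL queries against them.
I am thinking of using GAWK (GNU AWK) to get every line into CSV format. Something like: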
gawk '{print $1 ", " $2 ", " $3 ", " $4 ", " $5 ", " $6 ", " $7 ", " $8 ", " $9 ", " $9}'
gives me
50.47.142.25, -, -, [11/May/2012:08:51:02, 0, "GET /mywebpage/blah.jsp" 200, 123, -, -
which gets me close to a SQL INSERT statement. Beyond that, I need the date and time in the following format:
2012-05-11 08:51:02
that is, without the leading square bracket and in the format SQL Server expects. Any hints?
Answer 0: (score: 3)
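    #!/usr/bin/awk -f
    BEGIN {
        # Map month abbreviations to month numbers (Jan -> 1, ..., Dec -> 12).
        monthlist = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
        c = split(monthlist, monthsarr)
        for (i = 1; i <= c; i++) {
            months[monthsarr[i]] = i
        }
        # Indices of the split pieces that hold real data; the rest are empty.
        fieldlist = "1 2 3 5 8 10 11 14 15 17 20"
        fieldcount = split(fieldlist, fields)
        OFS = ","
    }
    {
        delim = ""
        # Split on spaces, square brackets, and double quotes.
        c = split($0, logarr, /[ \[\]"]/)
        # logarr[5] looks like 07/May/2012:00:00:14; break it into its parts.
        split(logarr[5], datearr, /[/:]/)
        # mktime() takes "YYYY MM DD HH MM SS" and treats it as local time.
        ts = mktime(datearr[3] " " months[datearr[2]] " " datearr[1] " " datearr[4] " " datearr[5] " " datearr[6])
        logarr[5] = strftime("%F %T", ts)
        # Print only the listed fields, comma-separated.
        for (f = 1; f <= fieldcount; f++) {
            printf "%s%s", delim, logarr[fields[f]]
            delim = OFS
        }
        printf "\n"
    }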
Given your sample log entry, the output is:
50.47.142.25,-,-,2012-05-07 00:00:14,0,GET,/mywebpage/blah.jsp,200,123,-,-
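The quotes and square brackets are dropped because they are used, along with spaces, as field separators. Splitting this way also creates a lot of bogus empty fields, which is why I iterate over an explicit list of field indices.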
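To see where the indices in the field list come from, it can help to dump every piece that the split produces. The following is a minimal sketch, not part of the original answer: the function name `show_fields` and the `<empty>` marker are made up for illustration, and the bracket expression is written as `[][ "]`, which matches the same four characters as the answer's `[ \[\]"]` without relying on backslash escapes inside brackets.

```shell
# Sample entry from the question (note the space before the closing quote).
line='50.47.142.25 - - [07/May/2012:00:00:14 +0000] 0 "GET /mywebpage/blah.jsp " 200 123 "-" "-"'

# show_fields (hypothetical helper): print each piece of the split with its
# index, marking empty pieces so the bogus fields are visible.
show_fields() {
    awk '{
        n = split($0, logarr, /[][ "]/)
        for (i = 1; i <= n; i++)
            printf "%d=%s\n", i, (logarr[i] == "" ? "<empty>" : logarr[i])
    }'
}

printf '%s\n' "$line" | show_fields
```

For the sample entry, the non-empty pieces land at indices 1, 2, 3, 5, 6, 8, 10, 11, 14, 15, 17, and 20. The field list in the script uses all of these except 6, which holds the `+0000` offset. If your log format differs (for example, no space before the closing quote), the indices shift, so it is worth rerunning this on a real line.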
Note that the mktime() and strftime() functions are GNU AWK (gawk) extensions and are not part of POSIX awk.
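If gawk is not available, the timestamp can probably be reformatted without either function, since only the order of the parts and the month name need to change. This is a rough sketch under that assumption, not the answer's method: `to_sql_ts` is a hypothetical helper name, and it does no time zone arithmetic or date validation; it only rearranges the digits that appear in the log.

```shell
# to_sql_ts (hypothetical helper): read timestamps such as
# "[11/May/2012:08:51:02" (one per line) and print "2012-05-11 08:51:02".
to_sql_ts() {
    awk '
    BEGIN {
        # Map month abbreviations to numbers, as the answer script does.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) months[names[i]] = i
    }
    {
        sub(/^\[/, "")            # drop the leading bracket, if present
        split($0, d, "[/:]")      # d[1]=day d[2]=month d[3]=year d[4..6]=time
        printf "%04d-%02d-%02d %02d:%02d:%02d\n", d[3], months[d[2]], d[1], d[4], d[5], d[6]
    }'
}

printf '%s\n' '[11/May/2012:08:51:02' | to_sql_ts   # prints: 2012-05-11 08:51:02
```

The same `printf` line could replace the mktime()/strftime() pair inside the main script, at the cost of accepting whatever day and time values the log contains.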