Loading Tomcat access logs into a SQL Server database

Time: 2012-05-15 11:24:16

Tags: regex sed awk

I have Tomcat access logs with entries like the following:

50.47.142.25 - - [07/May/2012:00:00:14 +0000] 0 "GET /mywebpage/blah.jsp " 200 123 "-" "-"

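I want to load all of the entries into a SQL table and then run SQL queries against them.

I am thinking of using GAWK (GNU AWK) to convert every line to CSV format. Something like: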

gawk '{print $1 ", " $2 ", " $3 ", " $4 ", " $5 ", " $6 ", " $7 ", " $8 ", " $9 ", " $10}'

which gives me

50.47.142.25, -, -, [11/May/2012:08:51:02, 0, "GET /mywebpage/blah.jsp" 200, 123, -, -

That gets me close to a SQL INSERT statement. Beyond that, I need the date and time in the following format:

2012-05-11 08:51:02

that is, without the leading square bracket and in the format SQL Server expects. Any tips?

1 answer:

Answer 0: (score: 3)

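#!/usr/bin/awk -f
# Requires GNU AWK (gawk) for mktime() and strftime().
BEGIN {
    # Map month abbreviations to month numbers: months["Jan"] = 1, ...
    monthlist = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
    c = split(monthlist, monthsarr)
    for (i = 1; i <= c; i++) {
        months[monthsarr[i]] = i
    }
    # Indexes of the split pieces to print; the others are empty or unwanted
    fieldlist = "1 2 3 5 8 10 11 14 15 17 20"
    fieldcount = split(fieldlist, fields)
    OFS = ","
}

{
    delim = ""
    # Split on space, [, ] and "; every single separator character ends a piece
    c = split($0, logarr, /[ \[\]"]/)
    # logarr[5] looks like 07/May/2012:00:00:14
    split(logarr[5], datearr, /[/:]/)
    # mktime() expects "YYYY MM DD HH MM SS"
    ts = mktime(datearr[3] " " months[datearr[2]] " " datearr[1] " " datearr[4] " " datearr[5] " " datearr[6])
    logarr[5] = strftime("%F %T", ts)
    for (f = 1; f <= fieldcount; f++) {
        printf "%s%s", delim, logarr[fields[f]]
        delim = OFS
    }
    printf "\n"
}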

Given your sample log entry, the output is:

50.47.142.25,-,-,2012-05-07 00:00:14,0,GET,/mywebpage/blah.jsp,200,123,-,-
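To get from a line like this to the INSERT statement the question asks about, one option is a second small awk pass. The following is only a sketch: `csv_to_insert` is a hypothetical helper name, `access_log` is a made-up table name, and it assumes the table's columns are in the same order as the fields and accept these values as quoted literals. It also assumes no field contains a comma.

```sh
#!/bin/sh
# csv_to_insert is a hypothetical helper; access_log is a made-up table name.
# It wraps every comma-separated value in single quotes and emits one INSERT.
csv_to_insert() {
    awk -F',' -v q="'" -v table="access_log" '{
        vals = ""
        for (i = 1; i <= NF; i++) {
            v = $i
            gsub(q, q q, v)              # escape a single quote by doubling it
            vals = vals (i > 1 ? ", " : "") q v q
        }
        printf "INSERT INTO %s VALUES (%s);\n", table, vals
    }'
}

printf '%s\n' '50.47.142.25,-,-,2012-05-07 00:00:14,0,GET,/mywebpage/blah.jsp,200,123,-,-' | csv_to_insert
```

For large logs, writing the CSV to a file and using a bulk-load tool is probably a better fit than one INSERT per line.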

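In the output, the quotes and square brackets are gone because they are used, together with the space, as field separators. Splitting this way also produces many empty fields, which is why I iterate over an explicit list of field indexes.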
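To see where the numbers in the field list come from, it can help to print every piece the split produces. This is a small illustration, not part of the answer's script; `show_pieces` is a hypothetical helper name, and the bracket expression is written as `[][ "]`, which matches the same four characters as the answer's `[ \[\]"]`.

```sh
#!/bin/sh
# show_pieces is a hypothetical helper name used only for this illustration.
# It prints every piece produced by splitting a line on space, [, ] and ".
show_pieces() {
    awk '{
        # Each single separator character ends a piece, so two separators
        # in a row (for example a space followed by [) yield an empty piece.
        n = split($0, piece, /[][ "]/)
        for (i = 1; i <= n; i++)
            printf "%d=<%s>\n", i, piece[i]
    }'
}

line='50.47.142.25 - - [07/May/2012:00:00:14 +0000] 0 "GET /mywebpage/blah.jsp " 200 123 "-" "-"'
printf '%s\n' "$line" | show_pieces
```

For the sample entry, the non-empty pieces land at indexes 1, 2, 3, 5, 6, 8, 10, 11, 14, 15, 17 and 20. Index 6 holds the `+0000` offset, which the field list leaves out.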

Note that the mktime() and strftime() functions are specific to GNU AWK (gawk).
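If gawk is not available, the timestamp can also be rearranged with plain string handling, since the task only changes the layout and does not need an epoch value. This is a sketch of an alternative technique rather than the answer's method; `to_sql_datetime` is a hypothetical helper name. Unlike mktime(), it does not validate the date, and an unknown month name comes out as `00`.

```sh
#!/bin/sh
# to_sql_datetime is a hypothetical helper name for this sketch.
# It turns "07/May/2012:00:00:14" into "2012-05-07 00:00:14" without
# any gawk-only time functions.
to_sql_datetime() {
    awk '
    BEGIN {
        # Build month["Jan"] = 1 ... month["Dec"] = 12
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++)
            month[names[i]] = i
    }
    {
        # d[1]=day d[2]=month name d[3]=year d[4]=hour d[5]=minute d[6]=second
        split($0, d, "[/:]")
        printf "%s-%02d-%s %s:%s:%s\n", d[3], month[d[2]], d[1], d[4], d[5], d[6]
    }'
}

printf '%s\n' '07/May/2012:00:00:14' | to_sql_datetime
```

In the answer's script, the same printf could replace the mktime()/strftime() pair when assigning logarr[5], using sprintf instead.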