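How to make a bash script build a CSV log file from a list of keywords

Date: 2017-03-10 04:17:14

Tags: linux bash csv logging

I am trying to use a bash script to turn a complex log file into a single CSV file. So far I have only managed to extract the keywords from the log file. Please help.

Sample of the complex log file (10k lines):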

"$date1" "url=$a1&http=$a2&ip=$a3&from=$a4"

"$date2" "url=$b1&http=$b2&from=$a4&sip=$b5"

"$date3" "url=$c1&http=$c2&ip=$c3&UID=$c6&K-Id=c8"

"$date4" "http=$d2&ip=$d3&from=$d4&utm_id=$d7"

I extracted the keywords and saved them in a file like this:

url
http
ip
from
sip
UID
utm_id

I need a bash script that uses this file to produce a CSV like this:

DATE    URL  HTTP  IP  FROM  SIP  UID  utm_ID  K_id
$date1  a1   a2    a3  a4
$date2  b1   b2        b4    b5
$date3  c1   c2    c3             c6           c8
$date4  d1   d2    d3  d4              d7

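Please help.

2 answers:

Answer 0: (score: 1)

Here is a working example written in gawk, tested with the data from your question. For each date it prints how many times each parameter occurs.

log.awk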

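/.*=.*/ { # ignore all lines without url parameters
    # With -F '["&=?]' and the sample lines from the question, the keys are
    # fields 4, 6, 8, ...; a "?"-terminated prefix before the first key would
    # shift them to 5, 7, ...
    for (i = 4; i < NF; i += 2)
        d[substr($2, 1, 10)][$i]++   # awk strings are 1-indexed
        # if your date format is 2017-02-09T06:15:24.349847Z, change to
        # d[$2][$i]++
}

END {
    for (i in d) {
        for (j in d[i]) {
            t[j]++ # find all parameters
        }
    }

    # print header
    printf "DATE"
    for (p in t) {
        printf "\t\t%s", toupper(p)
    }
    printf "\n"

    # print one row per date: the count for each parameter, or an empty cell
    for (i in d) {
        printf "%s", i
        for (p in t) {
            if (p in d[i]) {
                printf "\t\t%s", d[i][p]
            } else {
                printf "\t\t"
            }
        }
        printf "\n"
    }
}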

Save the above as log.awk, then run it from your bash shell:

$ gawk -F '["&=?]' -f log.awk little-output.log
DATE    HTTP  FROM  UTM_ID  URL  K-ID  UID  IP  SIP
$date1  1     1             1               1
$date2  1     1             1                   1
$date3  1                   1    1     1    1
$date4  1     1     1                       1

The output is tab-separated, so it lines up properly in a terminal even where a pasted copy does not; you can also redirect it to a file.
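
If you need the parameter values in the cells rather than counts, with the column order taken from the keyword file described in the question, something along these lines may work. This is only a sketch in plain awk wrapped in a shell function: the name log_to_csv and its two arguments are hypothetical, it assumes every log line consists of exactly two double-quoted fields as in the sample, and it does no CSV quoting, so values containing commas or quotes would need extra handling. Values are copied verbatim, including any leading "$".

```shell
#!/bin/bash
# Sketch only: log_to_csv and its arguments are hypothetical names.
# Usage: log_to_csv KEYWORDS_FILE LOG_FILE
log_to_csv() {
    awk -F'"' '
        # First file: one keyword per line, kept in file order.
        NR == FNR {
            if ($0 != "") keys[++nkeys] = $0
            next
        }
        # First record of the log file: print the header once.
        !printed_header {
            header = "DATE"
            for (k = 1; k <= nkeys; k++) header = header "," toupper(keys[k])
            print header
            printed_header = 1
        }
        # Log lines that carry key=value pairs.
        /=/ {
            split("", row)                 # clear the per-line map
            n = split($4, pairs, "&")      # $4 is the quoted parameter string
            for (i = 1; i <= n; i++) {
                eq = index(pairs[i], "=")
                if (eq > 0)
                    row[substr(pairs[i], 1, eq - 1)] = substr(pairs[i], eq + 1)
            }
            line = $2                      # $2 is the quoted date
            for (k = 1; k <= nkeys; k++)
                line = line "," ((keys[k] in row) ? row[keys[k]] : "")
            print line
        }
    ' "$1" "$2"
}
```

For example, log_to_csv keys.txt little-output.log > out.csv, where keys.txt stands for the keyword file. Keys that are not listed in the keyword file (K-Id in the question's list) are silently dropped, so add them to the file if you want them as columns.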

Answer 1: (score: 0)

Here is something to get you started. You can run it like this:

./script_below some_log_file.log

The approach is basically:

for each line:
    initialize a new empty key-value map
    save the date into map
    for key/value pairs after date:
        put key value pair into map

    print the contents of the map

Here is an implementation in Bash:

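#!/bin/bash

set -e

readonly input_file="$1"

# Columns to print, in order; "date" comes from the first quoted field.
known_keys=("date" "url" "http" "ip" "from" "sip" "UID" "utm_id" "K-Id")

# Build a printf format with one right-aligned, 7-character column per key.
format=""
for key in "${known_keys[@]}"; do
    format+="%7s"
done
format+="\n"

# Header row.
printf "$format" "${known_keys[@]}"

while IFS= read -r line; do
    # Start each line with a fresh key/value map.
    unset attrs
    declare -A attrs

    # Drop the double quotes, then split on whitespace: date, then parameters.
    vals=(${line//\"/})
    attrs['date']=${vals[0]}

    # Turn "k1=v1&k2=v2" into the word list "k1 v1 k2 v2".
    sub_vals=(${vals[1]//[=&]/ })

    # Walk the list two words at a time, storing each value without its "$".
    set -- "${sub_vals[@]}"
    while [ $# -gt 0 ]; do
        attrs["$1"]="${2/$/}"
        shift
        shift
    done

    printf "$format" \
        "${attrs['date']}" "${attrs['url']}" "${attrs['http']}" "${attrs['ip']}" \
        "${attrs['from']}" "${attrs['sip']}" "${attrs['UID']}" "${attrs['utm_id']}" "${attrs['K-Id']}"

done < "$input_file"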

This prints:

   date    url   http     ip   from    sip    UID utm_id   K-Id
 $date1     a1     a2     a3     a4                            

 $date2     b1     b2            a4     b5                     

 $date3     c1     c2     c3                   c6            c8

 $date4            d2     d3     d4                   d7       

Oh, and a final note: although I have shown that this can indeed be done in Bash, I would recommend using a full, proper programming language instead.
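
If you do stay in Bash and want comma-separated rows instead of fixed-width columns, the per-line step could be sketched as below. The helper name line_to_csv is hypothetical, it needs Bash 4 or later for associative arrays, and it assumes the same two-field line layout as the sample, with values that contain no commas, quotes, or glob characters.

```shell
#!/bin/bash
# Sketch only: line_to_csv is a hypothetical helper name.
known_keys=("url" "http" "ip" "from" "sip" "UID" "utm_id" "K-Id")

# Print one log line as a CSV row: the date, then each known key in order.
line_to_csv() {
    local line=${1//\"/}      # strip the double quotes
    local date=${line%% *}    # text before the first space
    local query=${line#* }    # text after the first space
    local -A attrs=()
    local pair val key row

    # Split on "&" only. $query is unquoted on purpose, so this assumes
    # the values contain no glob characters such as "*".
    local IFS='&'
    for pair in $query; do
        [[ $pair == ?*=* ]] || continue   # skip anything that is not key=value
        val=${pair#*=}
        attrs["${pair%%=*}"]=${val#\$}    # drop a leading "$" from the value
    done

    row=$date
    for key in "${known_keys[@]}"; do
        row+=",${attrs[$key]}"
    done
    printf '%s\n' "$row"
}

# Example: convert a whole file, skipping blank lines.
# while IFS= read -r line; do
#     [[ -n $line ]] && line_to_csv "$line"
# done < some_log_file.log
```

Keeping the parsing in a function that handles one line makes it easy to try on a single sample line before running it over the full 10k-line file.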