Convert a multi-line log into single lines

Time: 2015-07-26 11:44:10

Tags: bash shell awk

I have a multi-line log file and I want to convert it into single-line log entries.

Multi-line example:

6/13/2015 12:00:47 AM - {   562} START Web 
6/13/2015 12:00:47 AM - Requested Web connection from 123.125.71.103 [123.125.71.103], ID=562 
6/13/2015 12:01:24 AM - {   563} START POP3 
6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=563 
6/13/2015 12:01:24 AM - (   563) USER test.mail@test.me 
6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=563 
6/13/2015 12:01:24 AM - {   563} END POP3
6/13/2015 12:01:24 AM - {   564} START POP3 
6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=564 
6/13/2015 12:01:24 AM - (   564) USER test.mail@test.me 
6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=564 
6/13/2015 12:01:24 AM - {   564} END POP3
6/13/2015 12:01:40 AM - Web connection with 123.125.71.103 [123.125.71.103] ended. ID=562 
6/13/2015 12:01:40 AM - {   562} END Web

What I want is one output line per log ID, with all lines that share the same ID (for example "562") joined together:

6/13/2015 12:00:47 AM - {   562} START Web 6/13/2015 12:00:47 AM - Requested Web connection from 123.125.71.103 [123.125.71.103], ID=562 6/13/2015 12:01:40 AM - Web connection with 123.125.71.103 [123.125.71.103] ended. ID=562 6/13/2015 12:01:40 AM - {   562} END Web
6/13/2015 12:01:24 AM - {   563} START POP3 6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=563 6/13/2015 12:01:24 AM - (   563) USER test.mail@test.me  6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=563  6/13/2015 12:01:24 AM - {   563} END POP3
6/13/2015 12:01:24 AM - {   564} START POP3 6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=564 6/13/2015 12:01:24 AM - (   564) USER test.mail@test.me  6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=564  6/13/2015 12:01:24 AM - {   564} END POP3

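I wrote the following bash script, but it does not do what I want: it merges all "POP3" or "Web" messages into one line instead of separating them by message ID.

Script: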

#!/bin/bash

HOME=/var/tmp/test.txt

ID=`((awk '$6 ~/[0-9]\W/ {print $6}' $HOME | awk '{gsub (/)/, ""); print}' | awk '{gsub (/}/, ""); print}') && (awk '$11 ~/[0-9]/ {print $11}' $HOME | awk '{gsub ("ID=", ""); print}'))`


for ID in $HOME
do
        awk '!/Web/' $HOME | xargs >> final.txt
        awk '/Web/' $HOME | xargs >> final.txt
done

Any suggestions on how I should write the loop so that lines with the same ID are merged?

1 Answer:

Answer 0 (score: 1)

You can do this with an Awk script:

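#!/usr/bin/awk -f
# Group log lines by connection ID and print one joined line per ID.
{
    if ($5 ~ /[{(]/) {
        # "{   562} START Web" or "(   563) USER ...": the ID is in field 6
        split($6, b, /[)}]/)
        id = b[1]
    } else {
        # Every other line ends with "ID=562": the ID is in the last field
        split($NF, b, "=")
        id = b[2]
    }
    a[id] = a[id] FS $0    # append the current line to the entry for this ID
}
END {
    # The opening brace must be on the same line as END
    for (id in a)
        print a[id]
}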

Run it like this:

$ awk -f script.awk logfile
 6/13/2015 12:00:47 AM - {   562} START Web  6/13/2015 12:00:47 AM - Requested Web connection from 123.125.71.103 [123.125.71.103], ID=562  6/13/2015 12:01:40 AM - Web connection with 123.125.71.103 [123.125.71.103] ended. ID=562  6/13/2015 12:01:40 AM - {   562} END Web
 6/13/2015 12:01:24 AM - {   563} START POP3  6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=563  6/13/2015 12:01:24 AM - (   563) USER test.mail@test.me  6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=563  6/13/2015 12:01:24 AM - {   563} END POP3
 6/13/2015 12:01:24 AM - {   564} START POP3  6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=564  6/13/2015 12:01:24 AM - (   564) USER test.mail@test.me  6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=564  6/13/2015 12:01:24 AM - {   564} END POP3

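The script checks whether the 5th field contains a { or ( character and, depending on the result, splits either the 6th field or the last field to extract the id. The id is then used as a key in the array a, and the current line ($0) is appended to the value stored under that key. Once every line has been processed, the END block prints all elements of the array.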
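
One detail to be aware of: awk does not specify the order in which for (id in a) visits the keys, so the grouped lines may not always come out in the order shown above. The following is a minimal sketch of a variant that prints each ID in the order it first appears and also avoids the leading and doubled spaces. It assumes the same log layout as the sample in the question; the function name merge_by_id and the helper array order are illustrative names, not part of the original answer.

```shell
#!/bin/sh
# merge_by_id (hypothetical helper name): join log lines that share an ID.
# Reads the files given as arguments, or standard input if none are given.
merge_by_id() {
    awk '
    {
        # Lines like "{   562} START Web" or "(   563) USER ..." carry the
        # ID in field 6; every other line is assumed to end with "ID=562".
        if ($5 ~ /[{(]/) {
            split($6, b, /[)}]/)
            id = b[1]
        } else {
            split($NF, b, "=")
            id = b[2]
        }
        sub(/[ \t]+$/, "")            # strip trailing whitespace from the line
        if (!(id in a)) {
            order[++n] = id           # remember the order of first appearance
            a[id] = $0
        } else {
            a[id] = a[id] " " $0      # join with a single space
        }
    }
    END {
        # Print groups in first-seen order instead of relying on "for (id in a)"
        for (i = 1; i <= n; i++)
            print a[order[i]]
    }' "$@"
}
```

After defining the function (for example by sourcing the file), it could be used as: merge_by_id logfile > final.txt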