I have a multi-line log file that I want to convert into single-line log entries.
Multi-line example:
6/13/2015 12:00:47 AM - { 562} START Web
6/13/2015 12:00:47 AM - Requested Web connection from 123.125.71.103 [123.125.71.103], ID=562
6/13/2015 12:01:24 AM - { 563} START POP3
6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=563
6/13/2015 12:01:24 AM - ( 563) USER test.mail@test.me
6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=563
6/13/2015 12:01:24 AM - { 563} END POP3
6/13/2015 12:01:24 AM - { 564} START POP3
6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=564
6/13/2015 12:01:24 AM - ( 564) USER test.mail@test.me
6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=564
6/13/2015 12:01:24 AM - { 564} END POP3
6/13/2015 12:01:40 AM - Web connection with 123.125.71.103 [123.125.71.103] ended. ID=562
6/13/2015 12:01:40 AM - { 562} END Web
What I want is single-line output in which all lines sharing the same log ID (for example, "562") are joined together:
6/13/2015 12:00:47 AM - { 562} START Web 6/13/2015 12:00:47 AM - Requested Web connection from 123.125.71.103 [123.125.71.103], ID=562 6/13/2015 12:01:40 AM - Web connection with 123.125.71.103 [123.125.71.103] ended. ID=562 6/13/2015 12:01:40 AM - { 562} END Web
6/13/2015 12:01:24 AM - { 563} START POP3 6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=563 6/13/2015 12:01:24 AM - ( 563) USER test.mail@test.me 6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=563 6/13/2015 12:01:24 AM - { 563} END POP3
6/13/2015 12:01:24 AM - { 564} START POP3 6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=564 6/13/2015 12:01:24 AM - ( 564) USER test.mail@test.me 6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=564 6/13/2015 12:01:24 AM - { 564} END POP3
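I tried to do this with the following bash script, but it does not work: it merges all the "POP3" or "Web" messages into a single line instead of separating them by message ID.
Script: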
#!/bin/bash
HOME=/var/tmp/test.txt
ID=`((awk '$6 ~/[0-9]\W/ {print $6}' $HOME | awk '{gsub (/)/, ""); print}' | awk '{gsub (/}/, ""); print}') && (awk '$11 ~/[0-9]/ {print $11}' $HOME | awk '{gsub ("ID=", ""); print}'))`
for ID in $HOME
do
awk '!/Web/' $HOME | xargs >> final.txt
awk '/Web/' $HOME | xargs >> final.txt
done
Any suggestions on how I should write the loop so that lines with the same ID are merged?
Answer 0: (score: 1)
You can do this with an Awk script:
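#!/usr/bin/awk -f
{
    # Lines such as "- { 562} START Web" or "- ( 563) USER ..." carry the ID in field 6.
    if ($5 ~ /[{(]/) {
        split($6, b, /[)}]/)
        id = b[1]
    } else {
        # Every other line ends with "ID=<n>".
        split($NF, b, "=")
        id = b[2]
    }
    if (id in a) {
        a[id] = a[id] FS $0
    } else {
        order[++n] = id    # remember first-seen order; "for (id in a)" order is unspecified
        a[id] = $0
    }
}
END {
    # The opening brace must sit on the same line as END.
    for (i = 1; i <= n; i++)
        print a[order[i]]
}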
Run it like this:
$ awk -f script.awk logfile
6/13/2015 12:00:47 AM - { 562} START Web 6/13/2015 12:00:47 AM - Requested Web connection from 123.125.71.103 [123.125.71.103], ID=562 6/13/2015 12:01:40 AM - Web connection with 123.125.71.103 [123.125.71.103] ended. ID=562 6/13/2015 12:01:40 AM - { 562} END Web
6/13/2015 12:01:24 AM - { 563} START POP3 6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=563 6/13/2015 12:01:24 AM - ( 563) USER test.mail@test.me 6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=563 6/13/2015 12:01:24 AM - { 563} END POP3
6/13/2015 12:01:24 AM - { 564} START POP3 6/13/2015 12:01:24 AM - Requested POP3 connection from 10.127.251.37 [10.127.251.37], ID=564 6/13/2015 12:01:24 AM - ( 564) USER test.mail@test.me 6/13/2015 12:01:24 AM - POP3 connection with 10.127.251.37 [10.127.251.37] ended. ID=564 6/13/2015 12:01:24 AM - { 564} END POP3
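The script checks whether the 5th field is a { or ( character, and splits either the 6th field or the last field accordingly to get the correct id. The id is then used as a key into the array a, so that each line ($0) is appended to the value stored for its id. Once every line has been processed, all the elements of the array are printed.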
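To see the ID-extraction step on its own, here is a minimal sketch that prints only the id chosen for each input line. The function name extract_ids is a hypothetical helper introduced for illustration, and the sketch assumes the exact field layout of the sample log (date, time, AM/PM, a dash, then the message).

```shell
#!/bin/sh
# Hypothetical helper (not part of the answer's script): reads log lines on
# stdin and prints the ID that the grouping logic would assign to each one.
extract_ids() {
    awk '{
        if ($5 ~ /[{(]/) {
            # e.g. "- { 562} START Web": the ID sits in field 6 as "562}"
            split($6, b, /[)}]/)
            id = b[1]
        } else {
            # e.g. "... ended. ID=562": the ID sits in the last field
            split($NF, b, "=")
            id = b[2]
        }
        print id
    }'
}

# Three sample lines taken from the log in the question.
printf '%s\n' \
    '6/13/2015 12:00:47 AM - { 562} START Web' \
    '6/13/2015 12:01:24 AM - ( 563) USER test.mail@test.me' \
    '6/13/2015 12:01:40 AM - Web connection with 123.125.71.103 [123.125.71.103] ended. ID=562' |
    extract_ids
```

On these three lines it prints 562, 563 and 562, one per line, so the first and third lines would end up under the same key. If your real log has a different number of leading fields, the $5 and $6 positions would need adjusting.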