awk | Merge lines based on a field match

Posted: 2013-02-13 05:51:59

Tags: sed awk

I need help with the following:

Input file:

abc message=sent session:111,x,y,z
pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z
pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z
abc message=sent session:589,x,y,z
pqr message=receive session:589,4,5,7

Output file:

abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7

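Notes:

In the source file, each "sent" message has a corresponding "receive".
Only session 342 has no receive.
The session numbers are not known in advance and cannot be hard-coded. So merge only those sent and receive lines that have a matching session number.

2 answers:

Answer 0 (score: 1)

Here's one way using awk. Run like:
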
awk -f script.awk file

Contents of script.awk:

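{
    # Save the whole line before $0 is modified.
    x = $0

    # Strip everything up to and including the colon, and everything from
    # the first comma onward, leaving only the session ID in $0.
    gsub(/[^:]*:|,.*/,"")

    # Append the line to this session's entry and count lines per session.
    a[$0] = (a[$0] ? a[$0] "," FS : "") x
    b[$0]++
}

END {
    for (i in a) {
        # Two lines for a session means both sent and receive were seen.
        print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort"
    }
}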

Results:

abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7

Alternatively, here's the one-liner:

awk '{ x = $0; gsub(/[^:]*:|,.*/,""); a[$0] = (a[$0] ? a[$0] "," FS : "") x; b[$0]++ } END { for (i in a) print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort" }' file

Note that if you don't care about sorted output, you can drop the pipe to sort. HTH.
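
The key step in the script above is the gsub call, which reduces each line to its session ID so that the ID can be used as the array index. A minimal sketch to see that step in isolation (the function name session_key is made up for illustration and is not part of the answer):

```shell
# Hypothetical helper: print the session ID that the script uses as its array key.
session_key() {
    # Remove everything up to and including the colon, and everything
    # from the first comma onward; what remains is the session ID.
    printf '%s\n' "$1" | awk '{ gsub(/[^:]*:|,.*/, ""); print }'
}

session_key 'abc message=sent session:111,x,y,z'
session_key 'pqr message=receive session:589,4,5,7'
```

With the sample lines from the question this prints 111 and 589. It assumes each line contains exactly one colon, directly before the session ID.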

Answer 1 (score: 1)

Another way:

awk -F "[:,]"  '/=sent/{a[$2]=$0;}/=receive/{print a[$2], $0;delete a[$2];}END{for(i in a)print a[i],"NO MATCH";}' file

Results:

abc message=sent session:111,x,y,z pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z pqr message=receive session:123,4,5,7
abc message=sent session:589,x,y,z pqr message=receive session:589,4,5,7
abc message=sent session:342,x,y,z NO MATCH

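When a sent record is encountered, it is stored in an array with the session ID as the index. When a receive record is encountered, the sent record is fetched from the array and printed together with the receive record. The sent record is also deleted from the array at that point. In END, all records remaining in the array are printed with NO MATCH.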
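
The one-liner above prints unmatched sessions last and joins the pair with a space. If the exact format from the question is wanted (input order kept, ", " as separator, NOMATCH as one word), the same idea can be sketched as follows. This is only a variant under assumptions: the function name merge_sessions and the array names are made up, and it assumes each session has at most one sent line.

```shell
# Hypothetical helper: pair each "sent" line with its "receive" line by session ID.
merge_sessions() {
    awk -F '[:,]' '
        # Fields are split on ":" and ",", so $2 is the session ID.
        /message=sent/    { sent[$2] = $0; order[++n] = $2; next }
        /message=receive/ { recv[$2] = $0 }
        END {
            # Walk the sessions in the order their "sent" lines appeared.
            for (k = 1; k <= n; k++) {
                id = order[k]
                print sent[id] ", " ((id in recv) ? recv[id] : "NOMATCH")
            }
        }
    ' "$@"
}

# Sample input taken from the question.
printf '%s\n' \
    'abc message=sent session:111,x,y,z' \
    'pqr message=receive session:111,4,5,7' \
    'abc message=sent session:342,x,y,z' \
    'abc message=sent session:589,x,y,z' \
    'pqr message=receive session:589,4,5,7' | merge_sessions
```

It can also be run on a file, as in merge_sessions file. Because the output is produced in END from the recorded order, no pipe to sort is needed, at the cost of holding all lines in memory.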