How to merge a set of columns based on a common field

Time: 2020-04-29 00:56:04

Tags: bash unix awk sed merge

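How can I do the following in bash?

a) Remove columns 1, 2, and 6.

b) Rows are identified by the field "packetId". There can be one or two rows with the same "packetId". If two rows share a packetId, append the last field of the second row to the first row.

c) If a "packetId" has only one row, ignore that row and do not print it.

Input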

SequenceId,TimeStamp,packetId,size,secondaryid,eventType,randomfield,Source,Destination,SystemTime
1,3:41:24,1,100,xyz,event1,abc,S1,D1,1586989874
2,3:41:25,1,100,xyz,event2,abc,S1,D1,1586989877
3,3:41:26,2,100,xyz,event1,abc,S1,D1,1586989879
4,3:41:26,3,100,xyz,event1,abc,S1,D1,1586989871
5,3:41:26,3,100,xyz,event2,abc,S1,D1,1586989879

Output

packetId,size,secondaryid,randomfield,Source,Destination,SystemTime,OtherSystemTime
1,100,xyz,abc,S1,D1,1586989874,1586989877
3,100,xyz,abc,S1,D1,1586989871,1586989879

1 answer:

Answer 0 (score: 0)

You can do all of this in awk, but it is simpler to use cut to drop the unwanted fields first:

$ cut -d, -f3-5,7- input.csv |
  awk -F, 'NR == 1 { print $0 ",OtherSystemTime"; next }
           { if ($1 in seen) print seen[$1] "," $NF; else seen[$1] = $0 }'
packetId,size,secondaryid,randomfield,Source,Destination,SystemTime,OtherSystemTime
1,100,xyz,abc,S1,D1,1586989874,1586989877
3,100,xyz,abc,S1,D1,1586989871,1586989879
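
For comparison, here is a rough sketch of the awk-only variant mentioned above. It is one possible way to write it, not necessarily what the answer's author had in mind. The helper function `keep` and the shell variables `input` and `result` are names made up for this illustration, and the sample data is copied from the question. The sketch assumes the input has no blank lines and no quoted fields containing commas.

```shell
#!/bin/sh
# Sample data copied from the question (the variable name is hypothetical).
input='SequenceId,TimeStamp,packetId,size,secondaryid,eventType,randomfield,Source,Destination,SystemTime
1,3:41:24,1,100,xyz,event1,abc,S1,D1,1586989874
2,3:41:25,1,100,xyz,event2,abc,S1,D1,1586989877
3,3:41:26,2,100,xyz,event1,abc,S1,D1,1586989879
4,3:41:26,3,100,xyz,event1,abc,S1,D1,1586989871
5,3:41:26,3,100,xyz,event2,abc,S1,D1,1586989879'

result=$(printf '%s\n' "$input" | awk -F, -v OFS=, '
    # keep() is a hypothetical helper: it rebuilds the current record
    # without fields 1, 2 and 6 (the extra parameters act as locals).
    function keep(    i, out) {
        out = $3
        for (i = 4; i <= NF; i++)
            if (i != 6) out = out OFS $i
        return out
    }
    # Header line: trimmed header plus the name of the appended column.
    NR == 1    { print keep(), "OtherSystemTime"; next }
    # Second row for a packetId: print the stored first row plus this
    # row'"'"'s last field.
    $3 in seen { print seen[$3], $NF; next }
    # First row for a packetId: remember it, print nothing yet.
               { seen[$3] = keep() }
')

printf '%s\n' "$result"
```

As in the cut version, a packetId that appears only once is stored in `seen` but never printed, which covers requirement c). The question does not say what should happen if a packetId appears three or more times; this sketch would print one merged line for each row after the first.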