Merge 2 lists based on the first 2 columns

Time: 2015-01-07 12:58:20

Tags: join awk

I need to merge 2 lists based on columns 1 and 2.

File1:

client1,server1,3000.00
client1,server2,2500.00
client1,server3,1500.00
client2,server1,4500.00
client2,server2,2300.00
client2,server3,1230.00
client3,server1,3400.00
client3,server2,4500.00
client3,server3,1245.00
client4,server1,3400.00
client5,server2,4500.00
client6,server3,1245.00
client7,server1,3400.00
client7,server2,4500.00
client8,server3,1245.00
client8,server1,3400.00
client8,server2,4500.00
client9,server3,1245.00

File2:

client1,server1,windows,250g
client1,server2,linux,450g
client1,server3,linux,400g
client2,server1,windows,250g
client2,server2,linux,450g
client2,server3,linux,400g
client3,server1,windows,250g
client3,server2,linux,450g
client3,server3,linux,400g

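I need to update file2 with the rows that are missing from it, that is, the column 1 and column 2 pairs found only in file1, and pad them with commas so that every row keeps the same number of columns.

With this example, the output should look like this: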

client1,server1,windows,250g
client1,server2,linux,450g
client1,server3,linux,400g
client2,server1,windows,250g
client2,server2,linux,450g
client2,server3,linux,400g
client3,server1,windows,250g
client3,server2,linux,450g
client3,server3,linux,400g
client4,server1,,
client5,server2,,
client6,server3,,
client7,server1,,
client7,server2,,
client8,server3,,
client8,server1,,
client8,server2,,
client9,server3,,

I tried using awk and join, but I could not get this result.

Creating a new file is fine if that is easier.

Thanks for your help.

3 answers:

Answer 0: (score: 1)

Try this one-liner:

awk -F, '{k=$1 FS $2}NR==FNR{a[k]++;print;next}!a[k]{print k",,"}' file2 file1
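For readers who want to see how the one-liner works, here is the same logic spread over several lines with comments. This is only a sketch: the two-row sample files and the names `tmp`, `seen`, and `merged` are made up for illustration and are not part of the answer.

```shell
# Hypothetical sample data, shortened from the question.
tmp=$(mktemp -d)
printf '%s\n' 'client1,server1,3000.00' 'client4,server1,3400.00' > "$tmp/file1"
printf '%s\n' 'client1,server1,windows,250g' > "$tmp/file2"

merged=$(awk -F, '
    { k = $1 FS $2 }                      # key: the first two columns
    NR == FNR { seen[k]++; print; next }  # first file (file2): remember the key, print the row unchanged
    !seen[k]  { print k ",," }            # second file (file1): print only unseen keys, padded with commas
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$merged"
rm -r "$tmp"
```

Note that file2 is passed first, so its rows are printed as they are, and file1 only contributes the keys that file2 lacks.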

Answer 1: (score: 1)

Another way:

awk -F, -vOFS="," 'NR!=FNR{NF--;NF+=2}!a[$1 FS $2]++' test2 test

awk -F, 'NR!=FNR{$0=$1 FS $2",,"}!a[$1 FS $2]++' test2 test

Shortest:

awk -F, '{x=$1","$2}NR!=FNR{$0=x",,"}!a[x]++' test2 test
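All three variants rely on the `!a[key]++` pattern, which is true only the first time a key appears. As a rough illustration, the sketch below runs a commented version of the shortest variant on made-up data. The sample rows (including the repeated `client4,server1` key) and the names `tmp`, `seen`, and `deduped` are assumptions for the demo; `test` and `test2` are assumed to stand for file1 and file2.

```shell
# Hypothetical sample data; the client4,server1 key is repeated on purpose.
tmp=$(mktemp -d)
printf '%s\n' 'client1,server1,3000.00' 'client4,server1,3400.00' 'client4,server1,9999.00' > "$tmp/test"
printf '%s\n' 'client1,server1,windows,250g' > "$tmp/test2"

deduped=$(awk -F, '
    { x = $1 "," $2 }          # key: the first two columns
    NR != FNR { $0 = x ",," }  # rows of the second file become the key plus two empty fields
    !seen[x]++                 # true only on the first sighting of a key, so each key prints once
' "$tmp/test2" "$tmp/test")

printf '%s\n' "$deduped"
rm -r "$tmp"
```

A side effect worth knowing: because every key is printed at most once, a key repeated within the second file also appears only once in the output.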

Answer 2: (score: 0)

Use the join command. The problem is that join cannot join on multiple fields, so we need to temporarily replace the first comma:

join -t , -o 0,2.2,2.3 -a 1 <(sed 's/,/:/' file1) <(sed 's/,/:/' file2) | sed 's/:/,/'
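One caveat: join expects both inputs to be sorted on the join field, and file1 in the question is not fully sorted (`client8,server3` comes before `client8,server1`). A possible variant, sketched below with made-up sample rows, sorts both inputs first and uses temporary files instead of `<( )`, so it does not depend on bash process substitution. The names `tmp`, `f1.sorted`, `f2.sorted`, and `joined` are hypothetical.

```shell
# Hypothetical sample data, deliberately unsorted like file1 in the question.
tmp=$(mktemp -d)
printf '%s\n' 'client8,server3,1245.00' 'client1,server1,3000.00' 'client8,server1,3400.00' > "$tmp/file1"
printf '%s\n' 'client1,server1,windows,250g' > "$tmp/file2"

# Fuse the first two columns into one key (client:server), then sort on that key.
sed 's/,/:/' "$tmp/file1" | LC_ALL=C sort -t , -k 1,1 > "$tmp/f1.sorted"
sed 's/,/:/' "$tmp/file2" | LC_ALL=C sort -t , -k 1,1 > "$tmp/f2.sorted"

# -a 1 keeps unpaired rows from the first input; -o picks the key plus fields 2 and 3 of the second input.
joined=$(LC_ALL=C join -t , -o 0,2.2,2.3 -a 1 "$tmp/f1.sorted" "$tmp/f2.sorted" | sed 's/:/,/')

printf '%s\n' "$joined"
rm -r "$tmp"
```

The trade-off is that the output comes back in sorted key order rather than in the original file order.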