Shell: grouping repeated patterns in a file

Time: 2017-06-02 12:05:57

Tags: shell sorting awk scripting

Suppose I have a file:

a,anything,keyboard
b,anything,mouse
c,anything,door
a,anything,monitor
d,anything,keyboard

The result I want is:

a,anything,keyboard - monitor
b,anything,mouse
c,anything,door
d,anything,keyboard

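The pattern "a" is repeated, and I want to merge "keyboard" and "monitor" in the result.

My question is how to merge lines whose leading pattern repeats (in this example, "a") into a single line, appending whatever differs (in this example, the word "monitor").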

cat file.csv | cut -d',' -f1 | sort -u

Result:

a
b
c
d

The result I want:

a,anything,keyboard - monitor
b,anything,mouse
c,anything,door
d,anything,keyboard

1 answer:

Answer 0: (score: 1)

I would call this grouping rather than sorting.

gawk (GNU awk) solution:

awk -F, 'BEGIN{ PROCINFO["sorted_in"]="@val_str_asc" }{ a[$1]=($1 in a)? a[$1]" - "$3 : $0 }
         END{ asort(a); for(i in a) print a[i] }' file

Output:

a,anything,keyboard - monitor
b,anything,mouse
c,anything,door
d,anything,keyboard
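
If GNU awk is not available, the same grouping can probably be done with plain POSIX awk. The following is only a sketch, not part of the original answer: the function name group_by_first_field and the array names merged and order are made up for illustration, and it assumes the groups should be printed in the order their keys first appear rather than sorted.

```shell
# Hypothetical helper: merge the third field of lines that share the same first field.
# Uses only POSIX awk features; groups are printed in first-seen order.
group_by_first_field() {
    awk -F, '
        # First time this key appears: remember its position and keep the whole line.
        !($1 in merged) { order[++n] = $1; merged[$1] = $0; next }
        # Key seen before: append only the third field to the stored line.
        { merged[$1] = merged[$1] " - " $3 }
        # Print the groups in the order their keys first appeared.
        END { for (i = 1; i <= n; i++) print merged[order[i]] }
    ' "$@"
}

# Example run against the sample data from the question
printf '%s\n' \
    'a,anything,keyboard' \
    'b,anything,mouse' \
    'c,anything,door' \
    'a,anything,monitor' \
    'd,anything,keyboard' | group_by_first_field
```

With a file on disk it would be called as group_by_first_field file.csv. Keeping a separate order array avoids relying on the traversal order of for (i in a), which is unspecified outside gawk.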