File:
22 Hello
22 Hi
1 What
34 Where
21 is
44 How
44 are
44 you
Expected output:
22 HelloHi
1 What
34 Where
21 is
44 Howareyou
If the first field ($1) contains duplicate values, the second fields of those lines should be concatenated.
How can I achieve this with awk?
Answer 0 (score: 10)
$ awk '
    !seen[$1]++ { keys[++numKeys] = $1 }
    { str[$1] = str[$1] $2 }
    END{
        for (keyNr=1; keyNr<=numKeys; keyNr++) {
            key = keys[keyNr]
            print key, str[key]
        }
    }
' file
22 HelloHi
1 What
34 Where
21 is
44 Howareyou
Answer 1 (score: 6)
Using awk:
awk '!($1 in a){a[$1]=$2;next} $1 in a{a[$1]=a[$1] $2} END{for (i in a) print i, a[i]}' file
22 HelloHi
44 Howareyou
34 Where
21 is
1 What
Edit: to preserve the order:
awk '!($1 in a){b[++n]=$1; a[$1]=$2;next} $1 in a{a[$1] = a[$1] $2}
END{for (i=1; i<=n; i++) print b[i], a[b[i]]}' file
22 HelloHi
1 What
34 Where
21 is
44 Howareyou
Answer 2 (score: 5)
To maintain the order, you need to keep track of it:
awk '
! seen[$1]++ {order[++n] = $1}
{value[$1] = value[$1] $2}
END {for (i=1; i<=n; i++) print order[i], value[order[i]]}
' <<END
22 Hello
22 Hi
1 What
34 Where
21 is
44 How
44 are
44 you
END
22 HelloHi
1 What
34 Where
21 is
44 Howareyou
If you know that the values in column 1 are contiguous (as in the sample text), then:
awk '
prev != $1 {printf "%s%s ", sep, $1; sep=RS}
{printf "%s", $2; prev = $1}
END {print ""}
'
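For comparison, the same contiguous-keys idea can be sketched in Python with itertools.groupby, which likewise merges only adjacent lines that share a key. This is a minimal sketch rather than part of the original answer: the function name merge_adjacent is hypothetical, and it assumes each line holds a key and a value separated by whitespace.

```python
from itertools import groupby


def merge_adjacent(lines):
    """Yield "key joined_values" for each run of consecutive lines sharing a key.

    Hypothetical helper for illustration; assumes whitespace-separated
    "key value" lines and skips blank lines.
    """
    # Split each non-blank line into [key, value] (value may be missing)
    pairs = (line.split(None, 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        # Concatenate the second field of every line in this run
        values = "".join(pair[1].strip() for pair in group if len(pair) > 1)
        yield key + " " + values


if __name__ == "__main__":
    sample = ["22 Hello", "22 Hi", "1 What", "34 Where", "21 is",
              "44 How", "44 are", "44 you"]
    for merged in merge_adjacent(sample):
        print(merged)
```

As with the awk version above, a key that reappears later in non-adjacent lines starts a new output line instead of being merged into the earlier one.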
A couple of other approaches:
perl -lane '
push @keys, $F[0] unless grep {$_ eq $F[0]} @keys;
$val{$F[0]} .= $F[1]
} END {
print "$_ $val{$_}" for @keys
' file
And, venturing into niche territory:
#!/usr/bin/env tclsh
while {[gets stdin line] != -1} {dict append val {*}$line}
dict for {k v} $val {puts "$k $v"}
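Answer 3 (score: 1)
Here is an alternative solution in Python, as requested by @shellter:
from collections import defaultdict

with open("file") as infile:
    d = defaultdict(str)
    # Build a dictionary mapping each key to its concatenated values
    for line in infile:
        line = line.strip()
        k, _, v = line.partition(" ")
        d[k] += v
    # Print everything
    for k, v in d.items():
        print(k, v)
Note that the order is only preserved here on Python 3.7 and later, where dicts keep insertion order; on older versions the output order is arbitrary. Here is an alternative that tracks the order explicitly and gives exactly the desired output: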
from collections import defaultdict

with open("file") as infile:
    d = defaultdict(str)
    orig_order = []
    # Build a dictionary mapping each key to its concatenated values
    for line in infile:
        line = line.strip()
        k, _, v = line.partition(" ")
        d[k] += v
        # Record the key the first time it is seen
        if k not in orig_order:
            orig_order.append(k)
    # Print everything in the original order
    for k in orig_order:
        print(k, d[k])
Note that these are quickly written solutions; I am sure they could easily be made shorter or more flexible.
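As one possible example of a shorter, more flexible variant, the sketch below wraps the logic in a function and relies on plain dicts preserving insertion order (guaranteed from Python 3.7). The name merge_by_key and its joiner parameter are hypothetical, introduced only for illustration; the input is assumed to be an iterable of "key value" lines separated by a single space.

```python
def merge_by_key(lines, joiner=""):
    """Return a dict mapping each first field to its joined second fields.

    Hypothetical helper for illustration. Keys come out in first-seen
    order because dicts preserve insertion order (Python 3.7+).
    """
    merged = {}
    for line in lines:
        key, _, value = line.strip().partition(" ")
        if not key:
            continue  # skip blank lines
        if key in merged:
            merged[key] += joiner + value
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    sample = ["22 Hello", "22 Hi", "1 What", "34 Where", "21 is",
              "44 How", "44 are", "44 you"]
    for key, value in merge_by_key(sample).items():
        print(key, value)
```

Because the function accepts any iterable of lines, it can be fed an open file object directly, for example merge_by_key(open("file")), and a non-empty joiner such as "," separates the merged values instead of concatenating them.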
Answer 4 (score: 0)
If the order does not matter, this will work:
awk '{a[$1]=a[$1]$2} END {for (i in a) {print i, a[i]}}' file
...and if the order does matter:
awk '{if (!($1 in a)) b[++i]=$1; a[$1]=a[$1]$2} END {for (j=1;j<=i;j++) {print b[j], a[b[j]]}}' file