Question

I need to reorder the columns of this (tab-separated) data:

   1 cat    plays
   1 dog    eats
   1 horse  runs
   1 red    dog
   1 the    cat
   1 the    cat

so that it prints:

cat     plays   1
dog     eats    1
horse   runs    1
red     dog     1
the     cat     2

I tried:

sort [input] | uniq -c | awk '{print $2 "\t" $3 "\t" $1}' > [output]

The result was:

1       cat     1
1       dog     1
1       horse   1
1       red     1
2       the     1

Can anyone give me some insight into what is going wrong? Thanks.

Answer 1

Because the output of cat input | sort | uniq -c is:

   1    1 cat    plays
   1    1 dog    eats
   1    1 horse  runs
   1    1 red    dog
   2    1 the    cat

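the count added by uniq -c becomes $1 and the original columns shift to $2, $3 and $4, so you need something like this: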

cat input | sort | uniq -c | awk '{print $3 "\t" $4 "\t" $1}'

Answer 2

uniq -c adds an extra column at the front. This should give you the desired output:

$ sort file | uniq -c | awk '{print $3 "\t" $4 "\t" $1}'
cat     plays   1
dog     eats    1
horse   runs    1
red     dog     1
the     cat     2
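
To see the shifted field numbers for yourself, here is a small self-contained check. It is only a sketch: the sample data is recreated with printf as a shortened copy of the question's input, and the variable names (input, counted, result) are made up for illustration.

```shell
# Shortened copy of the sample input from the question (tab-separated).
input=$(printf '1\tcat\tplays\n1\tdog\teats\n1\tthe\tcat\n1\tthe\tcat\n')

# uniq -c prepends the repeat count, pushing the original columns one field to the right.
counted=$(printf '%s\n' "$input" | sort | uniq -c)

# Now $1 is the count from uniq -c, $2 is the original first column,
# and $3 and $4 are the two words.
result=$(printf '%s\n' "$counted" | awk '{print $3 "\t" $4 "\t" $1}')
printf '%s\n' "$result"
```

Printing "$counted" before the awk step shows the two numeric columns side by side, which is why $2 and $3 in the original attempt returned a number and only one word.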

Answer 3

With awk and sort:

$ awk '{a[$2 OFS $3]++}END{for(k in a)print k,a[k]}' OFS='\t' file | sort -nk3 
cat     plays   1
dog     eats    1
horse   runs    1
red     dog     1
the     cat     2
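
For anyone who finds the one-liner dense, the same idea can be spelled out with comments. This is a rough sketch rather than a replacement: the sample data is a shortened, deliberately unsorted copy of the question's input, and the names input, result, count and key are my own.

```shell
# Shortened, unsorted sample data (tab-separated), for illustration only.
input=$(printf '1\tthe\tcat\n1\tcat\tplays\n1\tthe\tcat\n1\tdog\teats\n')

result=$(printf '%s\n' "$input" | awk -v OFS='\t' '
    # Columns 2 and 3 together form the key; each occurrence bumps its counter.
    { count[$2 OFS $3]++ }
    # for-in visits keys in an unspecified order, so ordering is left to sort.
    END { for (key in count) print key, count[key] }
' | sort -nk3)
printf '%s\n' "$result"
```

Because awk does the counting itself, the input does not need to be sorted first, and sort only has to order the already aggregated lines.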

Answer 4

If you have GNU awk (gawk), you can do this with gawk alone, using its asorti() function:

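#!/usr/bin/gawk -f
# Adjust the path if gawk lives elsewhere. "#!/usr/bin/env gawk -f" fails on
# Linux, where env receives "gawk -f" as a single argument.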
{
    a[$2 "\t" $3]++
}
END {
    asorti(a, b)
    for (i = 1; i in b; ++i) print b[i] "\t" a[b[i]]
}

As a one-liner:

gawk '{++a[$2"\t"$3]}END{asorti(a,b);for(i=1;i in b;++i)print b[i]"\t"a[b[i]]}' file

Output:

cat     plays   1
dog     eats    1
horse   runs    1
red     dog     1
the     cat     2

Update: to preserve the original order without sorting, use:

#!/usr/bin/awk -f
!a[$2 "\t" $3]++ {
    b[++i] = $2 "\t" $3
}
END {
    for (j = 1; j <= i; ++j) print b[j] "\t" a[b[j]]
}

or

awk '!a[$2"\t"$3]++{b[++i]=$2"\t"$3}END{for(j=1;j<=i;++j)print b[j]"\t"a[b[j]]}' file

This version works with any awk, not just gawk.

The output should be the same this time, because the input is already sorted.
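
The difference only shows up with unsorted input. Here is a small sketch of that case, using made-up sample rows and my own names (input, result, count, order, n) in place of the shorter a, b and i used above:

```shell
# Unsorted sample data (tab-separated); "the cat" appears first and twice.
input=$(printf '1\tthe\tcat\n1\tdog\teats\n1\tthe\tcat\n1\tcat\tplays\n')

result=$(printf '%s\n' "$input" | awk '
    # The pattern is true only the first time a key is seen, so each key is
    # recorded once, at the position of its first occurrence. The counter
    # itself is incremented on every line.
    !count[$2 "\t" $3]++ { order[++n] = $2 "\t" $3 }
    # Walk the keys in first-seen order and print each with its final count.
    END { for (j = 1; j <= n; ++j) print order[j] "\t" count[order[j]] }
')
printf '%s\n' "$result"
```

With this input, "the cat" comes out first with a count of 2, followed by "dog eats" and "cat plays", that is, in the order the keys first appeared rather than alphabetically.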

Rearranging columns using AWK