Order a text file's lines by frequency of occurrence

Time: 2013-12-10 16:56:14

Tags: linux bash sed awk

Given a text file that contains repeated lines, for example:

this is a line
this is a line
this is another line
this is a line
this is yet another line
this is yet another line

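Is it possible, on the command line, to print each unique line once, sorted by how often it occurs?

That is, the result for the text above would be: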

this is a line
this is yet another line
this is another line

These lines occur 3, 2, and 1 times respectively.

2 answers:

Answer 0: (score: 4)

Try this:

sort file|uniq -c|sort -rn

Edit: Also, if you want to remove the counter at the start of each line, just pipe to sed 's/^\s*[0-9]* \(.*\)$/\1/' at the end of the command above.
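
As a minimal sketch, the counting pipeline and the counter-stripping step can be wrapped in one shell function. The function name by_frequency is hypothetical, and the sed expression is a simplified variant that matches plain spaces instead of \s, on the assumption that uniq -c pads its counts with spaces:

```shell
# Hypothetical helper: print each distinct line once, most frequent first.
# Reads the files given as arguments, or standard input if there are none.
by_frequency() {
  sort "$@" |            # group identical lines together so uniq can count them
    uniq -c |            # prefix each distinct line with its count
    sort -rn |           # order by count, highest first
    sed 's/^ *[0-9]* //' # drop the padded count and the single space after it
}

# Usage with the sample text from the question.
printf '%s\n' \
  'this is a line' \
  'this is a line' \
  'this is another line' \
  'this is a line' \
  'this is yet another line' \
  'this is yet another line' | by_frequency
```

Lines that occur equally often come out in whatever order the final sort leaves them, so treat the relative order of ties as unspecified.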

Answer 1: (score: 1)

You can do this:

awk '{ a[$0]++ } END { for (i in a) print a[i], i }' file | sort -nr
3 this is a line
2 this is yet another line
1 this is another line
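
awk's for (i in a) loop visits keys in an unspecified order, which is why the trailing sort is needed. A sketch of the same idea that also drops the counts and breaks ties alphabetically might look like this. The function name by_frequency_awk and the tab-separated intermediate format are assumptions made for illustration, not part of the answer above:

```shell
# Hypothetical helper: awk counts the lines, sort orders them, cut drops the counts.
by_frequency_awk() {
  tab=$(printf '\t')
  # Emit "count<TAB>line" so the count can be split off reliably afterwards.
  awk '{ count[$0]++ } END { for (line in count) printf "%d\t%s\n", count[line], line }' "$@" |
    sort -t "$tab" -k1,1nr -k2 |  # count descending, then line text ascending for ties
    cut -f2-                      # keep everything after the first tab
}

# Usage with the sample text from the question.
printf '%s\n' \
  'this is a line' \
  'this is a line' \
  'this is another line' \
  'this is a line' \
  'this is yet another line' \
  'this is yet another line' | by_frequency_awk
```

Unlike the sort | uniq -c approach, this does not need to sort the whole input up front, only the list of distinct lines, at the cost of holding every distinct line in awk's memory.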