Removing the count numbers from uniq command output (Bash, Linux)

Date: 2015-12-10 19:01:32

Tags: linux bash

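I am trying to save a clean list of all the duplicated lines in a file, keeping only those that occur more than 50 times. This is the command I use to do it: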

sort file | uniq -cd | awk '$1>50' | sort -nr > output

Now I have a list like this, sorted with the highest counts first:

  12960 <Groan>
   5760 <Snore>
   3985 Talk to <<1>>
   2880 <Nightmare mumble>
   1976 ACCEPT
   1935 Examine
   1744 Yes?
   1733 Hm?
   1701 <<1>>
   1587 What is it?
   1578 What do you want?
   1563 What?
   1514 Well?
   1427 glyph^n
   1189 Examining…
   1019 Now what?
   1010 You again?
   1009 What do you want now?
   1008 <sigh> Again?
    827 Fit only to use for research, or to sell for scrap.
    804 Sack
    792 Back again?
    691 Take
    690 Food
    688 Opening…
    605 Search
    596 Book
    574 Urn
    [...]

But what I want is a list like the one below, without the counts in my file, so that I can work with the file more freely:

<Groan>
<Snore>
Talk to <<1>>
<Nightmare mumble>
ACCEPT
Examine
Yes?
Hm?
<<1>>
What is it?
What do you want?
What?
Well?
glyph^n
Examining…
Now what?
You again?
What do you want now?
<sigh> Again?
Fit only to use for research, or to sell for scrap.
Sack
Back again?
Take
Food
Opening…
Search
Book
Urn
[...]

1 Answer:

Answer 0: (score: 0)

Using GNU grep:

sort file | uniq -cd | awk '$1>50' | sort -nr | grep -oP '^ *[0-9]+ \K.*'
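
In the command above, `-P` selects Perl-compatible regular expressions and `-o` prints only the matched part of each line; `\K` discards everything matched so far (the leading spaces, the count, and one space), so only the original text remains. Where grep lacks `-P` support, the same stripping can probably be done with sed. The following is a minimal sketch, not part of the original answer; the function name `strip_counts` is hypothetical, and it assumes the usual `uniq -c` layout of optional leading spaces, the count, then a single space.

```shell
# strip_counts: hypothetical helper name, not from the original answer.
# Reads `uniq -c` style lines on stdin and removes the leading count.
# Assumes the layout: optional spaces, digits, exactly one space, then the text.
strip_counts() {
    # Basic regular expression, so it does not depend on GNU extensions.
    sed 's/^ *[0-9][0-9]* //'
}

# Small demonstration on two lines taken from the question's listing.
printf '  12960 <Groan>\n   3985 Talk to <<1>>\n' | strip_counts
```

It would slot into the pipeline from the question as the last stage: sort file | uniq -cd | awk '$1>50' | sort -nr | strip_counts > output. Only the single space after the count is removed, so text that itself starts with spaces should keep its indentation.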