BASH - 计算文件中类似行的数量

时间:2011-12-24 21:41:28

标签: bash

我在一个论坛中有一个主题,人们可以在这里写下他们的十大歌曲列表。我想计算一首歌被列出的次数。相似性必须比较不区分大小。

文件结构示例:

Join Date: Apr 2005
Location: bama via new orleans
Age: 48
Posts: 2,369
Re: Top 10 Songs Jethro Tull
oh dearrrr. the only way for all kaths to keep their last shred of sanity: fly through this list as quickly as possible, without stopping to think for a microsecond...
velvet green
dun ringill
skating away on the thin ice of a new day
sossity yer a woman
fat man
life's a long song
jack-a-lynn
teacher
mother goose
elegy

 03-10-2010, 02:29 AM      #5 (permalink)
Sox
Avoiding The Swan Song



Join Date: Jan 2010
Location: Derbyshire, England
Age: 43
Posts: 5,991
 Re: Top 10 Songs Jethro Tull
Wow !!!! Where do I start ?
Dun Ringill
Aqualung
With You There To Help Me
Jack Frost And The Hooded Crow
We Used To Know
Witch's Promise
Pussy Willow
Heavy Horses
My Sunday Feeling
Locomotive Breath

Join Date: Nov 2009
Posts: 1,418
 Re: Top 10 Songs Jethro Tull
Too bad they all can't make the list, but here's ten I never get tired of listening to:

Christmas Song
Witches Promise
Life's A Long Song
Living In The Past
Rainbow Blues
Sweet Dream
Minstral In The Gallery
Cup of Wonder
Rover
Something's On the Move

示例输出:

life's a long song 3
aqualung 1
...

3 个答案:

答案 0 :(得分:13)

你的文件“结构”在结构部门中有点缺乏,所以你必须在这个过程中处理一些错误。

假设您拥有名为input的文件中的所有内容,请尝试:

tr '[A-Z]' '[a-z]' < input | \
     egrep -v "^ *(join date|age|posts|location|re):" | \
     sort | \
     uniq -c

第一行缩小所有内容,第二行删除样本中看起来像电子邮件标题的内容,然后对唯一项进行排序和计数。

答案 1 :(得分:9)

此命令列出行和重复次数

sort nameFile | uniq -c 

答案 2 :(得分:2)

如何使用awk -

awk '
/:/||/^$/{next}{a[toupper($0)]++}
END{for(i in a) print i,a[i]}' INPUT_FILE

说明:

  

首先,我们确定其中包含:blank的行并忽略它们。存储的所有其他行都转换为大写并存储在数组中。   在我们的END statement中,我们打印出数组中的所有内容及其找到的次数。

测试:

awk '
/:/||/^$/{next}{a[toupper($0)]++}
END{for(i in a) print i,a[i]}' file1
SOX 1
CHRISTMAS SONG 1
CUP OF WONDER 1
SOSSITY YER A WOMAN 1
FAT MAN 1
PUSSY WILLOW 1
VELVET GREEN 1
WITH YOU THERE TO HELP ME 1
ELEGY 1
WE USED TO KNOW 1
TEACHER 1
MY SUNDAY FEELING 1
SWEET DREAM 1
JACK-A-LYNN 1
SOMETHING'S ON THE MOVE 1
ROVER 1
DUN RINGILL 2
AVOIDING THE SWAN SONG 1
JACK FROST AND THE HOODED CROW 1
WITCHES PROMISE 1
LIFE'S A LONG SONG 2
LIVING IN THE PAST 1
WITCH'S PROMISE 1
WOW !!!! WHERE DO I START ? 1
SKATING AWAY ON THE THIN ICE OF A NEW DAY 1
MINSTRAL IN THE GALLERY 1
RAINBOW BLUES 1
MOTHER GOOSE 1
HEAVY HORSES 1
AQUALUNG 1
LOCOMOTIVE BREATH 1