I want to print the words that appear inside "ctr{words}" and count how many times each word occurs in the file.
I tried:
sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' file
But it does not find all the words, only one word.
The file is:
962796057604|mar0101|0|00000107A20E00000A6C331650B920340C00|0|0|400019FD7DBFBF7F|1001|962796057604|0 |01001|||-1|795971936| 00962795971936|16||-1| 00962795971936|-1|0|2|0|416019000659493|0||||||0|0|2012.12.01 00:07:09|12|30|0|516|16|1|2012.12.01 00:06:39|1|0||202|20001||0B12F1001104697209100300000000000000|1|1|11000|0|0||0881006972091003F000||0 714F610045584E6|000000000000|3|1|0000000000000000|0|140|0|0|0|0|0|0|||0|2|||||||||||||||||||||0|||0| |0|1|143|acf{0}cif{0}fcf{0}con{0}cuf{0}ctr{**Mo7afazat**}cgpa{962796057604}vlr{0096279001300}cff{0}roaf{0}mpty{0}ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS} ||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962796298894|mar0101|0|000001028225AE4AD868A8B750B900980C00|1|0|4000018001000002||962796298894|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||3797|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|244|tid{111210532409329884}pfid{20}gob{1}rid{globitel} afid{}uid1{962796298894}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{**JaishanaIN**}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|100-50-0-962796605155|mar0101|0|00000102A20400000A6A439D50B920520C00|0|0|400019FD7DBFBF7F|1001|962796605155|1 6||||-1|b116c||16||-1||-1|0|0|0|416017002233360|0||||||0|0|1970.01.01 02:00:00|0|0|0|220|0|1|1970.01.01 02:00:00|1|0||194|0||000000000000000000000000000000000000|0|0||0|0||00000000000000000000||0000000000 000000|000000000000|0|0|0000000000000000|0|370|0|0|0|0|0|0|||0|0|||||||||||||||||||||0|||0||0|1|70|a cf{3}ussd{1}ctr{**ZainElKul**}ftksn{JMT}ftksr{0001}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|100-10-0
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797611253|mar0101|0|0000010282B54BD015FF4C4B50B8F96E0C00|1|0|4000018001000002||962797611253|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||885|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|243|tid{111220371293561120}pfid{20}gob{1}rid{globitel} afid{}uid1{962797611253}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{**ZainElKul**}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
-962795292027|mar0101|0|00000101A20200000A6A96B750B920300C00|0|0|400019FD7DBFBF7F|1001|962795292027|0 |01004|||-1|797196452| 00962797196452|16||-1| 00962797196452|-1|0|2|0|416018002276781|0||||||0|0|2012.12.01 00:07:09|12|12|23|516|16|1|2012.12.01 00:06:34|1|0||202|1||0B12F1001104697209100300000000000000|1|1|11000|0|0||0881006972091003F000||0714F 6100455AD67|000000000000|3|1|0000000000000000|0|30|0|0|0|0|0|0|||0|0|||||||||||||||||||||0|||0||0|1| 171|acf{0}cif{0}fcf{0}con{0}cuf{0}ctr{ZainUnlimited}cgpa{962795292027}vlr{0096279001300}cff{0}roaf{0}mpty{0}cacc{1;0;30}cquo{1;230;}ftksn{JMT}ftksr{000 1}ftktp{CallTicketCPOCS}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962796012818|mar0101|0|0000010882218115085D5F9150B920520C00|0|0|4000018001000002||962796012818|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|1|||||||||||||0|0|||70|0|0|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|258|tid{111221366974701289}pfid{17}gob{1}rid{globitel} afid{}uid1{962796012818}aid1{1}ar1{-2147483648}uid2{}aid2{-1}pid{DEFAULT_DECISION}pur{!GDRC Balance Check}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{**AlBarakehNew**}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-962797251349|mar0101|0|0000010282A451483EDFCFD350B920400C00|1|0|4000018001000002||962797251349|||||-1|||||-1||-1|0||||0||||||-1|-1|||-1|0|-1|-1|-1|2012.12.01 00:08:35|1|0||-1|0|||||||||||||0|0|||440|0|12|-2147483648|-2147483648|-2147483648|-2147483648|||||||||||||||||||||||||0|||0||1|6|245|tid{111211342745325133}pfid{20}gob{1}rid{globitel} afid{}uid1{962797251349}aid1{1}ar1{0}uid2{globitel}aid2{-1}pid{1234}pur{!GDRC COMMIT AMOUNT 0}ratinf{}rec{0}rots{0}tda{}mid{}exd{0}reqa{0}ctr{**ZainElKulSN**}ftksn{JMT}ftksr{0001}ftktp{PayCallTicket}||
1|34|2012.12.01 00:08:35|12|4|921-*203-0000000000-
Answer 0 (score: 1)
It looks like you only missed the counting step. The simplest way is to pipe the output through sort and uniq -c:
$ sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' file | sort | uniq -c
1 **Mo7afazat**
1 **JaishanaIN**
2 **ZainElKul**
1 ZainUnlimited
1 **AlBarakehNew**
1 **ZainElKulSN**
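If the most frequent words should come first, one possible follow-up is a second sort on the count column. The sketch below assumes one word per line on standard input; rank_by_count is a hypothetical helper name, not part of the original answer.

```shell
# rank_by_count: hypothetical helper for this sketch.
# Reads one word per line on stdin and prints "count word" lines,
# highest count first; ties are broken by the word itself.
rank_by_count() {
  sort | uniq -c | sort -k1,1nr -k2
}

# Usage (same sed command as above):
#   sed -n 's/.*ctr{\(.[^}]*\).*/\1/p' file | rank_by_count
```

The width of the count column printed by uniq -c differs between implementations, so any script that parses this output should tolerate leading spaces.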
Another way, using only awk (this needs GNU awk, because the three-argument form of match() is a gawk extension):
$ awk 'match($0,".*ctr{([^}]*)}.*",m){a[m[1]]++}END{for(i in a) print i,a[i]}' file
ZainUnlimited 1
**ZainElKulSN** 1
**Mo7afazat** 1
**ZainElKul** 2
**JaishanaIN** 1
**AlBarakehNew** 1
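Where gawk is not available, roughly the same count can be sketched with the standard two-argument match(), which sets RSTART and RLENGTH. This is only an illustration: count_ctr is a hypothetical function name, and unlike the greedy pattern above it takes the first ctr{...} on each line rather than the last, which makes no difference when each line has at most one.

```shell
# count_ctr: hypothetical helper for this sketch.
# Counts the values found inside ctr{...} in the file given as $1,
# using only POSIX awk features.
count_ctr() {
  awk '
    # match() sets RSTART/RLENGTH for the first "ctr{...}" on the line;
    # bracket expressions keep the braces literal in any awk
    match($0, /ctr[{][^}]*[}]/) {
      # skip the 4 characters of "ctr{" and drop the closing "}"
      word = substr($0, RSTART + 4, RLENGTH - 5)
      count[word]++
    }
    # the iteration order of "for (w in count)" is unspecified
    END { for (w in count) print w, count[w] }
  ' "$1"
}

# Usage:
#   count_ctr file
```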
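Answer 1 (score: 0)
When searching a file for matches, grep is usually the best choice.
Use grep with a positive lookbehind and uniq -c (the -P option requires a grep built with PCRE support, such as GNU grep):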
$ grep -Po "(?<=ctr{)[^}]+" file | uniq -c
1 Mo7afazat
1 JaishanaIN
2 ZainElKul
1 ZainUnlimited
1 AlBarakehNew
1 ZainElKulSN
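From man uniq:
Note: 'uniq' does not detect repeated lines unless they are adjacent.
For files where the duplicates are not adjacent, pipe through sort first. Be aware that the order in which each match was first found in the original file will be lost: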
$ grep -Po "(?<=ctr{)[^}]+" file | sort | uniq -c
1 AlBarakehNew
1 JaishanaIN
1 Mo7afazat
2 ZainElKul
1 ZainElKulSN
1 ZainUnlimited
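If the first-seen order matters and the duplicates may not be adjacent, one option is to let awk do the counting instead of sort and uniq. The following is a minimal sketch, assuming one word per line on standard input; count_in_order is a hypothetical helper name introduced here for illustration.

```shell
# count_in_order: hypothetical helper for this sketch.
# Reads one word per line on stdin and prints "count word" lines
# in the order in which each word was first seen.
count_in_order() {
  awk '
    !($0 in count) { order[++n] = $0 }   # first time this word appears
    { count[$0]++ }
    END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }
  '
}

# Usage (same grep command as above):
#   grep -Po "(?<=ctr{)[^}]+" file | count_in_order
```

This keeps every distinct word in memory, which is fine for a handful of values like the ones in the sample file.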