This is the log file I want to filter:
xxxyyy.com/plugins/status.gif?type=videoprogress;status=first;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=mid;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=third;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=complete;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=pause;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=mute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1547;cid=IN;cpid=1547
xxxyyy.com/plugins/status.gif?type=videoothers;status=unmute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=error;sid=6941c712-ca83-4aa1-a69a-931ca66df656;vid=606829;vrid=61478182;pid=1546;cid=IN;cpid=1546
I need output like this:
pid cid cpid Count
1545 IN 1545 4
1545 US 1545 2
1546 IN 1546 1
1547 IN 1547 1
Can someone please help me?
Answer 0 (score: 1)
kent$ awk -F';' '{a[$(NF-2) OFS $(NF-1) OFS $NF]++}
END{for(x in a)print x, a[x]}' file
pid=1547 cid=IN cpid=1547 1
pid=1545 cid=US cpid=1545 2
pid=1546 cid=IN cpid=1546 1
pid=1545 cid=IN cpid=1545 4
Now you can adjust the output to fit the format you need.
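One possible way to make that adjustment, as a minimal sketch rather than the answerer's own code: strip each `name=` prefix from the key with `gsub`, sort the rows, and print the header the question asks for. The names `sample_log` and `count_tuples` are made up for this illustration, and the `sort` step is an addition, since the order of awk's `for (x in a)` loop is unspecified.

```shell
# The eight log lines from the question, kept in a variable for the demo.
sample_log='xxxyyy.com/plugins/status.gif?type=videoprogress;status=first;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=mid;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=third;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=complete;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=pause;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=mute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1547;cid=IN;cpid=1547
xxxyyy.com/plugins/status.gif?type=videoothers;status=unmute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=error;sid=6941c712-ca83-4aa1-a69a-931ca66df656;vid=606829;vrid=61478182;pid=1546;cid=IN;cpid=1546'

# count_tuples: hypothetical helper name; reads log lines on stdin.
count_tuples() {
  awk -F';' '
    {
      key = $(NF-2) OFS $(NF-1) OFS $NF   # e.g. "pid=1545 cid=IN cpid=1545"
      gsub(/[^= ]+=/, "", key)            # strip each "name=" prefix
      count[key]++
    }
    END { for (k in count) print k, count[k] }
  ' | sort | { echo "pid cid cpid Count"; cat; }   # header first, then sorted rows
}

printf '%s\n' "$sample_log" | count_tuples
```

With the question's eight lines this prints the header followed by the four rows in the order the question shows.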
Answer 1 (score: 0)
Only slightly different from Kent's:
awk -F';' '{ split($6,pid,"="); split($7,cid,"="); split($8,cpid,"="); n[pid[2] OFS cid[2] OFS cpid[2]]++; } END { print "pid","cid","cpid","count"; for (p in n) { print p,n[p] } }' input.txt
This gives:
pid cid cpid count
1545 IN 1545 4
1545 US 1545 2
1546 IN 1546 1
1547 IN 1547 1
Here is the same code with comments:
{
split($6,pid,"="); split($7,cid,"="); split($8,cpid,"="); # Get the numbers from each pair in an array
n[pid[2] OFS cid[2] OFS cpid[2]]++; # count the tuples from the numbers (create an array with the tuples as key and increment it)
}
END {
print "pid","cid","cpid","count"; # print the header
for (p in n) { print p,n[p] } # print the key (tuples) and the count of it
}
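The script above depends on `pid`, `cid` and `cpid` being fields 6, 7 and 8. If the field order could vary from line to line (an assumption; the sample log does not show this), a variant that looks the values up by name might be sketched as follows. `sample_log` and `count_by_name` are hypothetical names, the input is a four-line subset of the question's log, and the `sort` step is added only to make the row order predictable.

```shell
# Four of the log lines from the question, kept in a variable for the demo.
sample_log='xxxyyy.com/plugins/status.gif?type=videoprogress;status=first;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=mid;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=third;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=error;sid=6941c712-ca83-4aa1-a69a-931ca66df656;vid=606829;vrid=61478182;pid=1546;cid=IN;cpid=1546'

# count_by_name: hypothetical helper name; reads log lines on stdin.
count_by_name() {
  awk -F';' '
    {
      split("", kv)                        # clear the per-line name/value map
      for (i = 2; i <= NF; i++) {          # field 1 holds the URL and type, skip it
        eq = index($i, "=")                # position of the first "="
        if (eq) kv[substr($i, 1, eq - 1)] = substr($i, eq + 1)
      }
      n[kv["pid"] OFS kv["cid"] OFS kv["cpid"]]++   # count each tuple
    }
    END { for (p in n) print p, n[p] }
  ' | sort | { echo "pid cid cpid count"; cat; }    # header first, then sorted rows
}

printf '%s\n' "$sample_log" | count_by_name
```

The trade-off is a loop over every field on every line, in exchange for not hard-coding field positions.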
Answer 2 (score: 0)
Another way, similar to the others:
awk -F';' '{for(i=0;i<3;i++){split($(NF-i),a,"=");x=a[2]" "x;NR==1&&y=a[1]" "y}
b[x]++;x=z}END{print y "count";for(i in b)print i b[i]}' file
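It extracts the name and value of each of the last three pairs, then increments an array using the values as the key.
At the end it prints the extracted names as a header, plus the new `count` header.
It then loops over the array, printing each key (the values) and its number of occurrences.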
pid cid cpid count
1547 IN 1547 1
1545 IN 1545 4
1546 IN 1546 1
1545 US 1545 2
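The steps described in this answer can also be written out in a longer, commented form. This is a sketch of the same idea, not the answerer's exact code: it walks the last three fields left to right and parenthesizes nothing implicit, and the names `sample_log`, `count_last_three`, `pair`, `key` and `header` are chosen here for illustration. The input is a three-line subset of the question's log.

```shell
# Three of the log lines from the question, kept in a variable for the demo.
sample_log='xxxyyy.com/plugins/status.gif?type=videoprogress;status=first;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=pause;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=mute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1547;cid=IN;cpid=1547'

# count_last_three: hypothetical helper name; reads log lines on stdin.
count_last_three() {
  awk -F';' '
    {
      key = ""
      for (i = 2; i >= 0; i--) {                 # fields NF-2, NF-1, NF in order
        split($(NF - i), pair, "=")              # pair[1] = name, pair[2] = value
        sep = (i < 2 ? OFS : "")                 # no separator before the first item
        key = key sep pair[2]
        if (NR == 1) header = header sep pair[1] # build the header from line 1 only
      }
      count[key]++
    }
    END {
      print header, "count"
      for (k in count) print k, count[k]         # row order is unspecified
    }
  '
}

printf '%s\n' "$sample_log" | count_last_three
```

Because the header names come from the first input line, this form keeps working if the last three fields are renamed, as long as every line uses the same three names.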