I want to sum when the first word of the URL string matches. For example, the output I want should contain the sum and the first word of the URL:
Count Response Url
3 400 data.internal.example.com
18 400 homeloans.internal.example.com
4 400 login.internal.example.com
465 400 login.internal.example.com
3 400 regions.internal.example.com
5 400 search.example.com
6 400 search.example.com
30 400 search.example.com
2 400 search.example.com
1 400 search.internal.example.com
1 422 login.example.com
1 422 login.example.com
139 422 newprojects.internal.example.com
1 422 notification.example.com
1 500 example.com
1 500 search.example.com
I obtained the above from a log file using Ruby code and shell commands:
result = `ruby -lane 'puts $F.values_at(9,8).join( \"\ \" )' #{@logfile} | grep -E '500|502|504|400|422|409|405' | grep -v "200" | grep -v "Nagar" | grep -v "Colony" | grep -v "Phase" | grep -v "Sector" | grep -v "Road" | grep -v "ignore_protected" | grep -v "LYF_LS_4002" | grep -v "utm_dynamicid" | sort | uniq -c`
The desired output is below:
Count Response Url
3 400 data
18 400 homeloans
469 400 login
3 400 regions
44 400 search
2 422 login
139 422 newprojects
1 422 notification
1 500 example.com
1 500 search.example.com
Answer 0 (score: 0)
Here's one in awk:
$ awk '
NR==1 {                              # print header
    print
    next
}
{
    split($3,t,".")                  # first word of the url
    len=length(a[$2 " " t[1]]+=$1)   # sum counts; track the width
    if(len>max)                      # of the widest sum for
        max=len                      # pretty-printing
}
END {
    for(i in a) {
        split(a[i],t," ")            # t[1] is the summed count; t[2] stays
                                     # empty and is printed as padding
        printf "%s%" max-length(t[1]) "s %s\n",t[1],t[2],i
    }
}' file
The records are output in seemingly random order:
Count Response Url
3 400 regions
3 400 data
1 500 example
1 422 notification
139 422 newprojects
1 500 search
44 400 search
18 400 homeloans
2 422 login
469 400 login
If you want the output sorted (by response and first word), use GNU awk and add PROCINFO["sorted_in"]="@ind_str_asc" at the start of the END {} block.
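A minimal sketch of that sorted variant (GNU awk assumed for the PROCINFO sorting; the column-width bookkeeping is dropped for brevity, and a few sample rows are inlined so the snippet is self-contained):

```shell
# Sum counts per (response, first URL word). With GNU awk,
# PROCINFO["sorted_in"] makes the END loop walk indices in string order;
# other awks ignore it and iterate in unspecified order.
awk '
NR==1 {                          # pass the header through
    print
    next
}
{
    split($3, t, ".")            # first word of the URL
    a[$2 " " t[1]] += $1         # sum per "response word"
}
END {
    PROCINFO["sorted_in"] = "@ind_str_asc"   # GNU awk only
    for (i in a)
        print a[i], i
}' <<'EOF'
Count Response Url
3 400 data.internal.example.com
4 400 login.internal.example.com
465 400 login.internal.example.com
1 500 search.example.com
EOF
```

With GNU awk this prints the header followed by `3 400 data`, `469 400 login`, and `1 500 search`, in that order.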
Answer 1 (score: 0)
A Perl version, with sorted output:
$ perl -lane 'next if $. == 1;              # Skip header line
  $F[2] =~ s/^[^.]+\K.*//;                  # Keep only the first word of the URL
  $recs{$F[1]}{$F[2]} += $F[0];
  END { $, = "\t"; print "Count", "Response", "URL";
    for $resp (sort keys %recs) {
      for $url (sort keys %{$recs{$resp}}) {
        print $recs{$resp}{$url}, $resp, $url
      }}}' input.txt
Count Response URL
3 400 data
18 400 homeloans
469 400 login
3 400 regions
44 400 search
2 422 login
139 422 newprojects
1 422 notification
1 500 example
1 500 search
And a short and sweet version using GNU datamash (this example assumes the columns are tab-separated; if they're not, add -W to the datamash options):
$ cut -d. -f1 input.txt | datamash -Hs groupby 2,3 sum 1
GroupBy(Response) GroupBy(Url) sum(Count)
400 data 3
400 homeloans 18
400 login 469
400 regions 3
400 search 44
422 login 2
422 newprojects 139
422 notification 1
500 example 1
500 search 1
The columns come out in a different order and the headers differ, but that's easy to adjust with awk or cut if needed.
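For instance, a small awk step after datamash could restore the original header and column order. A sketch, with a hypothetical `input.txt` built inline from a few of the question's rows (`-W` is passed to datamash here because this sample is space-separated):

```shell
# Build a small space-separated sample in the question's format.
cat > input.txt <<'EOF'
Count Response Url
3 400 data.internal.example.com
4 400 login.internal.example.com
465 400 login.internal.example.com
EOF

# Truncate the URL at the first dot, aggregate with datamash, then swap
# datamash's "Response Url sum" columns back into "Count Response Url".
cut -d. -f1 input.txt |
  datamash -W -Hs groupby 2,3 sum 1 |
  awk 'NR==1 { print "Count", "Response", "Url"; next }
       { print $3, $1, $2 }'
```

This prints the original-style header followed by lines such as `469 400 login`.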
Answer 2 (score: 0)
I was able to get it working with the line below:
ruby -lane 'puts $F.values_at(9,8).join( \"\ \" )' #{@logfile} | grep -E '500|502|504|400|422|409|405' | grep -v "200" | grep -v "Nagar" | grep -v "Colony" | grep -v "Phase" | grep -v "Sector" | grep -v "Road" | grep -v "ignore_protected" | grep -v "LYF_LS_4002" | grep -v "utm_dynamicid" | sort | cut -f1 -d "." | awk '{print $2 " service --- " $1 " response"}' | sort | uniq -c
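The core trick in that pipeline — truncating each line at the first dot and then letting `sort | uniq -c` do the counting, since every raw log line contributes exactly one occurrence — can be sketched in isolation on a few made-up lines:

```shell
# Each raw log line counts once, so after truncating at the first dot,
# "sort | uniq -c" yields per-(first word, response) totals directly.
printf '%s\n' \
  '400 login.internal.example.com' \
  '400 login.example.com' \
  '422 notification.example.com' |
  cut -d. -f1 |
  awk '{ print $2 " service --- " $1 " response" }' |
  sort | uniq -c
# Output (leading whitespace from uniq -c abbreviated):
#   2 login service --- 400 response
#   1 notification service --- 422 response
```

Note that this only works because the input here is one line per request; on a pre-summed table like the one at the top of the question you would need the awk/datamash summing approaches from the other answers instead.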