I have a log file from an Apache web server. I need to display, in the terminal, the top 10 hosts by request frequency for October 1, 2006. My code is:
cat log.txt | grep 01/Oct/2006 | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -10
Its output is:
6 k141cluster2.fsv.cvut.cz
4 cm-84.209.247.208.chello.no
4 bl1sch2043806.phx.gbl
4 207.188.28.33
3 ppp196-169.adsl.forthnet.gr
3 c-67-169-64-181.hsd1.ca.comcast.net
3 222.231.42.14
2 tang-six-o-five.mit.edu
2 slim07.kataweb.it
2 s010600055ddf8597.ed.shawcable.net
But I want it displayed as:
k141cluster2.fsv.cvut.cz 6
cm-84.209.247.208.chello.no 4
bl1sch2043806.phx.gbl 4
207.188.28.33 4
ppp196-169.adsl.forthnet.gr 3
c-67-169-64-181.hsd1.ca.comcast.net 3
222.231.42.14 3
tang-six-o-five.mit.edu 2
slim07.kataweb.it 2
s010600055ddf8597.ed.shawcable.net 2
How can I do this with tools such as cut, paste, head, tail, cat, tac, wc, join, grep, sort, and sed? I want to swap the two columns with each other, but I don't know how.
Answer 0 (score: 0)
Here is a solution without awk (as requested in the comments).
#!/bin/bash
# Number of visits
grep '01/Oct/2006' log.txt | cut -d' ' -f1 |
sort | uniq -c | sort -rn | head -10 |
# Remove leading spaces
sed 's/^ *//g' |
# Grep for numbers followed by a space
grep -o "[0-9]* " > visits
# Hosts
grep '01/Oct/2006' log.txt | cut -d' ' -f1 |
sort | uniq -c | sort -rn | head -10 |
sed 's/^ *//g' | cut -d ' ' -f2 > hosts
# For the frequency, use wc -l to count the total number of lines in the log (10000).
# 0.06% x 10000 = 6 visits
# Count the lines once, outside the loop
total=$(wc -l < log.txt)
while read nbr; do
# scale=2, two decimals
echo "scale=2; ($nbr*100/$total)" | bc
# Use > rather than >> so that a rerun does not append to an old file
done < visits > percent
paste hosts percent > hosts_percent
# Or, for the number of visits:
# paste hosts visits
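If only the swapped "host count" listing is needed, the cut and paste steps above can be condensed so that the pipeline runs once. The following is a minimal sketch, not the answer author's code: the function name swap_with_paste and the temporary files are hypothetical, and it assumes the input is in the padded "count host" form that uniq -c prints.

```shell
#!/bin/bash
# swap_with_paste (hypothetical name): read `uniq -c` style lines
# ("      6 host") on stdin and print them as "host 6".
# The body runs in a subshell so the helper variable does not leak.
swap_with_paste() (
    tmp=$(mktemp) || exit 1
    # Strip the leading padding so cut sees exactly two space-separated fields.
    sed 's/^ *//' > "$tmp"
    # Field 1 is the count; keep it in a second temporary file.
    cut -d' ' -f1 "$tmp" > "$tmp.counts"
    # Field 2 is the host; paste joins host and count line by line.
    cut -d' ' -f2 "$tmp" | paste -d' ' - "$tmp.counts"
    rm -f "$tmp" "$tmp.counts"
)
```

It would be used as the last stage of the original pipeline, for example: grep '01/Oct/2006' log.txt | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -10 | swap_with_paste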
Answer 1 (score: 0)
You can use sed to modify the output of the pipeline:
$ cat log.txt | grep 01/Oct/2006 | cut -d' ' -f1 | sort | uniq -c | sort -rn |
head -10 | sed -E 's/^([ ][ ]*[[:digit:]][[:digit:]]*[ ][ ]*)(.*$)/\2\1/'
k141cluster2.fsv.cvut.cz 6
cm-84.209.247.208.chello.no 4
bl1sch2043806.phx.gbl 4
207.188.28.33 4
ppp196-169.adsl.forthnet.gr 3
c-67-169-64-181.hsd1.ca.comcast.net 3
222.231.42.14 3
tang-six-o-five.mit.edu 2
slim07.kataweb.it 2
s010600055ddf8597.ed.shawcable.net 2
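Note that the first capture group in the expression above contains the padding that uniq -c puts around the count, so that padding is carried over between host and count. A possible variant that captures only the digits and emits exactly one space is sketched below; the function name swap_with_sed is hypothetical, and -E is assumed to be available (it is in GNU and BSD sed).

```shell
#!/bin/bash
# swap_with_sed (hypothetical name): turn "      6 host" into "host 6".
# Group 1 captures only the digits and group 2 the rest of the line,
# so the padding added by `uniq -c` is discarded.
swap_with_sed() {
    sed -E 's/^[[:space:]]*([0-9]+)[[:space:]]+(.*)$/\2 \1/'
}
```

For example: grep '01/Oct/2006' log.txt | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -10 | swap_with_sed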
You can also write a gawk script that replaces the entire pipeline and is easier to customize:
$ gawk 'function by_vi(i1,v1,i2,v2) {
    v1 = v1+0
    v2 = v2+0
    if (v1 > v2) return -1
    if (v2 > v1) return 1
    # vals are same; now sort on idx
    return (i2 < i1) ? -1 : (i1 != i2)
}
/01\/Oct\/2006/ {cnt[$1]++}
END {
    PROCINFO["sorted_in"]="by_vi"
    lcnt=1
    for (e in cnt) {
        printf "%s \t%s\n", e, cnt[e]
        if (++lcnt>10) break
    }
}' log.txt
# same output
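PROCINFO["sorted_in"] is a gawk extension, so the script above will not sort under other awk implementations. As a rough sketch for those cases, the counting can stay in awk while the ordering is left to sort. The function name top_hosts and its two parameters are hypothetical. Ties are broken by host name in ascending order here, which may differ from the order the gawk script produces.

```shell
#!/bin/bash
# top_hosts (hypothetical name): print "host count" for the ten most
# frequent hosts on lines containing a given date string.
#   $1: log file
#   $2: date string to match literally, e.g. 01/Oct/2006
top_hosts() {
    # index() does a plain substring search, so the slashes need no escaping.
    awk -v d="$2" 'index($0, d) { cnt[$1]++ }
                   END { for (h in cnt) print h, cnt[h] }' "$1" |
        # Sort by count (numeric, descending), then by host name.
        sort -k2,2nr -k1,1 |
        head -n 10
}
```

For example: top_hosts log.txt '01/Oct/2006'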