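I have an access log like the one below. I want to extract each IP address, count how many times each one appears, and then sort the addresses by that count.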
173.192.238.41 - - [28/Feb/2013:07:06:09 -0500] "GET / HTTP/1.1" 200 20644 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); http://spinn3r.com/robot) Gecko/2010040121 Firefox/3.0.19"
208.115.113.84 - - [28/Feb/2013:07:06:19 -0500] "GET /tag/bright HTTP/1.1" 404 327 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
94.228.34.214 - - [28/Feb/2013:07:10:16 -0500] "GET /alli-comes-home-12-10-09-day-224-2264/feed HTTP/1.1" 404 359 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
209.171.42.71 - - [28/Feb/2013:07:11:19 -0500] "GET /feed/atom HTTP/1.1" 404 326 "-" "Mozilla/5.0 (compatible; BlogScope/1.0; +http://www.blogscope.net/; U of Toronto)"
94.228.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
Can I sort it like this?
cat /path/to/access.log | awk '{print $1}' | sort | uniq -c
Answer 0 (score: 4)
You're close. After counting them, you have to sort by the count:
awk '{print $1}' /path/to/access.log | sort | uniq -c | sort -n
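If you want the busiest addresses first, a reverse numeric sort followed by head is one option. This is only a minimal sketch: the sample log written to a temporary file is made up for illustration, and the cutoff of 2 is arbitrary.

```shell
# Hypothetical sample log; only the first field (the client IP) matters here.
log=$(mktemp)
cat > "$log" << 'EOF'
10.0.0.1 - - "GET / HTTP/1.1" 200
10.0.0.2 - - "GET /a HTTP/1.1" 200
10.0.0.1 - - "GET /b HTTP/1.1" 404
10.0.0.3 - - "GET / HTTP/1.1" 200
10.0.0.1 - - "GET /c HTTP/1.1" 200
10.0.0.2 - - "GET /d HTTP/1.1" 200
EOF

# Count per IP, sort by count in descending order, keep the top 2.
top=$(awk '{print $1}' "$log" | sort | uniq -c | sort -rn | head -n 2)
printf '%s\n' "$top"
rm -f "$log"
```

The amount of leading whitespace that uniq -c puts before each count differs between implementations, so do not rely on fixed column positions when parsing the result.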
You can also do the counting in awk instead of using sort and uniq:
awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' /path/to/access.log | sort -n
Answer 1 (score: 0)
awk '{a[$1]++}END{for(i in a)print a[i],i}' your_log|sort -rn
or
perl -lane '$x{$F[0]}++;END{for(keys %x){print $x{$_}." ".$_;}}' your_log|sort -rn
Answer 2 (score: 0)
Here is one way to sort IPv4 addresses first by number of occurrences and then by address:
# cut takes only the first column from access.log
<access.log cut -d' ' -f1 |
# Presort the IP addresses so uniq can count them
sort |
uniq -c |
# Format the stream so it only contains `.' delimiters
sed 's/^ *//; s/ /./' |
# Now sort numerically based on each consecutive dot delimited column
sort -t. -k1,1n -k2,2n -k3,3n -k4,4n -k5,5n |
# Restore the first delimiter (between count and address) to a space
sed 's/\./ /'
Test input:
cat << EOF > access.log
173.192.238.41 - - [28/Feb/2013:07:06:09 -0500] "GET / HTTP/1.1" 200 20644 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); http://spinn3r.com/robot) Gecko/2010040121 Firefox/3.0.19"
208.115.113.84 - - [28/Feb/2013:07:06:19 -0500] "GET /tag/bright HTTP/1.1" 404 327 "-" "Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)"
94.228.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
94.228.34.214 - - [28/Feb/2013:07:10:16 -0500] "GET /alli-comes-home-12-10-09-day-224-2264/feed HTTP/1.1" 404 359 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
209.171.42.71 - - [28/Feb/2013:07:11:19 -0500] "GET /feed/atom HTTP/1.1" 404 326 "-" "Mozilla/5.0 (compatible; BlogScope/1.0; +http://www.blogscope.net/; U of Toronto)"
209.71.42.71 - - [28/Feb/2013:07:11:19 -0500] "GET /feed/atom HTTP/1.1" 404 326 "-" "Mozilla/5.0 (compatible; BlogScope/1.0; +http://www.blogscope.net/; U of Toronto)"
94.228.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
94.229.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
94.227.34.229 - - [28/Feb/2013:07:12:48 -0500] "GET /the-latest-design-franck-muller-watches-and-versace-watches-6838/feed HTTP/1.1" 404 386 "-" "magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
EOF
Output:
1 94.227.34.229
1 94.228.34.214
1 94.229.34.229
1 173.192.238.41
1 208.115.113.84
1 209.71.42.71
1 209.171.42.71
2 94.228.34.229
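If your sort supports version ordering, the same count-then-address ordering can probably be had without the sed round trip. This is a sketch under that assumption: -V is an extension found in GNU sort and some BSD sorts, not a POSIX option, and the miniature log below is made up for illustration.

```shell
# Hypothetical miniature log; only the first field (the client IP) is used.
log=$(mktemp)
cat > "$log" << 'EOF'
173.192.238.41 - -
94.228.34.229 - -
209.171.42.71 - -
209.71.42.71 - -
94.228.34.229 - -
94.227.34.229 - -
EOF

# Count with awk, then sort by count (numeric) and by address (version order).
# -V is a non-POSIX extension; it compares each dotted number numerically.
sorted=$(awk '{c[$1]++} END {for (ip in c) print c[ip], ip}' "$log" |
  sort -k1,1n -k2,2V)
printf '%s\n' "$sorted"
rm -f "$log"
```

Version ordering treats each run of digits as a number, so 94.x sorts before 173.x and 209.71.x before 209.171.x, which a plain lexical sort would get wrong.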