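I am trying to format logs that are generated every few hours. Below are a sample log and the code I have tried. Please help me get the desired format.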
[28/Jul/2006:10:27:10 -0500] GET /cgi-bin/try/ HTTP/1.0 200 iphone-S
[28/Jul/2006:10:27:10 -0200] GET /hidden/ HTTP/1.0 404 iphone-X
[28/Jul/2006:10:27:10 -0100] PUT /users/98761/geo/ HTTP/1.0 504 iphone-6s
[28/Jul/2006:10:27:10 -0400] POST /users/12345/places/ HTTP/1.0 202 iphone-7P
[28/Jul/2006:10:27:10 -0100] PUT /geo/1234/places/12/ HTTP/1.0 202 iphone-8
[28/Jul/2006:10:27:10 -0100] PUT /geo/1254/places/12/ HTTP/1.0 202 iphone-7s
[28/Jul/2006:10:27:10 -0100] PUT /geo/1294/places/12/ HTTP/1.0 202 iphone-6
---SERVER RESTART---
[28/Jul/2006:10:27:10 -0400] PUT /cgi-bin/try/ HTTP/1.0 200 iphone-3
[28/Jul/2006:10:27:10 -0500] POST /hidden/ HTTP/1.0 404 iphone-7P
[28/Jul/2006:10:27:10 -0500] POST /hidden/ HTTP/1.0 404 iphone-6s
---SERVER RESTART---
[28/Jul/2006:10:27:10 -0600] GET /users/98763/geo/ HTTP/1.0 504 iphone-6s
[28/Jul/2006:10:27:10 -0700] GET /users/12345/places/ HTTP/1.0 202 iphone-6
[28/Jul/2006:10:27:10 -0700] GET /users/12347/places/ HTTP/1.0 202 iphone-6
[28/Jul/2006:10:27:10 -0700] GET /users/12367/places/ HTTP/1.0 202 iphone-5s
[28/Jul/2006:10:27:10 -0700] GET /users/12387/places/ HTTP/1.0 202 iphone-7s
[28/Jul/2006:10:27:10 -0900] POST /geo/12346/places/4/ HTTP/1.0 202 iphone-X
Desired output:
"""
verb uri status counts
GET /cgi-bin/try/ 200 1
GET /hidden/ 404 1
GET /users/#/places/ 202 4
POST /geo/#/places/#/ 202 1
POST /hidden/ 404 2
POST /users/#/places/ 202 1
PUT /geo/#/places/#/ 202 3
PUT /users/#/geo/ 504 1
"""
The code I tried:
$ cat test.log | cut -d ']' -f2- | sort |head -n -2
GET /cgi-bin/try/ HTTP/1.0 200 iphone-S
GET /hidden/ HTTP/1.0 404 iphone-X
GET /users/12345/places/ HTTP/1.0 202 iphone-6
GET /users/12347/places/ HTTP/1.0 202 iphone-6
GET /users/12367/places/ HTTP/1.0 202 iphone-5s
GET /users/12387/places/ HTTP/1.0 202 iphone-7s
GET /users/98763/geo/ HTTP/1.0 504 iphone-6s
POST /geo/12346/places/4/ HTTP/1.0 202 iphone-X"""
POST /hidden/ HTTP/1.0 404 iphone-6s
POST /hidden/ HTTP/1.0 404 iphone-7P
POST /users/12345/places/ HTTP/1.0 202 iphone-7P
PUT /cgi-bin/try/ HTTP/1.0 200 iphone-3
PUT /geo/1234/places/12/ HTTP/1.0 202 iphone-8
PUT /geo/1254/places/12/ HTTP/1.0 202 iphone-7s
PUT /geo/1294/places/12/ HTTP/1.0 202 iphone-6
PUT /users/98761/geo/ HTTP/1.0 504 iphone-6s
I can use uniq -c to get the final counts, but I am stuck on replacing the numbers in the middle of each path with the # symbol.
Answer 0 (score: 1)
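The awk step keeps only the lines that start with [ (which skips the ---SERVER RESTART--- markers) and prints the verb, URI, and status fields. The sed command then uses s!pattern!replacement!g to perform a global search and replace. The search pattern /(users|geo|places)/[0-9]+ matches /users/, /geo/, or /places/ followed by a number. The replacement string /\1/# keeps the original word in place and changes the number to #.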
$ awk '/^\[/ {print $3,$4,$6}' test.log |
sed -r 's!/(users|geo|places)/[0-9]+!/\1/#!g' |
sort | uniq -c
1 GET /cgi-bin/try/ 200
1 GET /hidden/ 404
1 GET /users/#/geo/ 504
4 GET /users/#/places/ 202
1 POST /geo/#/places/#/ 202
2 POST /hidden/ 404
1 POST /users/#/places/ 202
1 PUT /cgi-bin/try/ 200
3 PUT /geo/#/places/#/ 202
1 PUT /users/#/geo/ 504
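To see the substitution on its own, here is a minimal sketch that runs the same sed expression on a few sample paths. The function name mask_ids is hypothetical and only used for illustration. It uses -E rather than -r; GNU sed accepts both as the switch for extended regular expressions.

```shell
# Hypothetical helper (not part of the original answer): replace the number
# that follows /users/, /geo/, or /places/ with '#'.
mask_ids() {
    sed -E 's!/(users|geo|places)/[0-9]+!/\1/#!g'
}

# Three sample URIs taken from the log above.
printf '%s\n' '/users/12345/places/' '/geo/1234/places/12/' '/hidden/' | mask_ids
# /users/#/places/
# /geo/#/places/#/
# /hidden/
```

Because of the g flag, both numbers in /geo/1234/places/12/ are replaced. A path with no number after one of the three keywords passes through unchanged.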
If you want the exact output format given in the question, you can use column to align the data into neat columns.
$ awk '/^\[/ {print $3,$4,$6}' test.log |
sed -r 's!/(users|geo|places)/[0-9]+!/\1/#!g' |
sort | uniq -c |
{ echo 'verb uri status count'; awk '{print $2,$3,$4,$1}'; } |
column -t
verb uri status count
GET /cgi-bin/try/ 200 1
GET /hidden/ 404 1
GET /users/#/geo/ 504 1
GET /users/#/places/ 202 4
POST /geo/#/places/#/ 202 1
POST /hidden/ 404 2
POST /users/#/places/ 202 1
PUT /cgi-bin/try/ 200 1
PUT /geo/#/places/#/ 202 3
PUT /users/#/geo/ 504 1
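As an alternative to the pipeline above, the filtering, masking, and counting can also be done in a single awk pass. This is only a sketch under stated assumptions, not the answer's method. The function name count_requests is hypothetical, and the masking rule is simplified: any run of digits directly after a slash becomes #, rather than only digits after users, geo, or places.

```shell
# Hypothetical helper: count requests per (verb, masked URI, status).
# Assumption: every "/digits" run in the URI is an ID and can be masked.
count_requests() {
    awk '
        /^\[/ {                              # skip lines such as ---SERVER RESTART---
            uri = $4
            gsub(/\/[0-9]+/, "/#", uri)      # replace each "/digits" run with "/#"
            counts[$3 " " uri " " $6]++      # key: verb, masked URI, status
        }
        END {
            for (key in counts) print key, counts[key]
        }
    ' "$@" | sort
}

# Usage: count_requests test.log | column -t
```

The trailing sort is needed because awk does not guarantee any iteration order for `for (key in counts)`. On the sample log this simplified rule gives the same groups as the sed version, but it would also mask a segment such as /123abc/ to /#abc/, so the keyword-based pattern is the safer choice if such paths can occur.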