How to filter IPs from a log file

Time: 2019-10-18 10:28:06

Tags: awk grep

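I have a log file from an Apache server, and I want to find every unique IP address that connected to /web/Service on that server.

In the sample, the address I need is the last IP address on each line, the one just before localhost.hosting.corp. I think this has to be done with grep/awk, but I'm not that good at it :)

Below is a sample of the log file; I replaced the real IPs with other values: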

1.9.2.2 - - [18/Oct/2019:11:53:54 +0200] "POST /web/Service HTTP/1.1" 200 793 "-" "Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50727.8810)" "1.2.2.10" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:53:55 +0200] "POST /web2/Service HTTP/1.1" 200 791 "-" "-" "1.2.1.4" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:54:12 +0200] "POST /web/Service HTTP/1.1" 200 793 "-" "Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50727.8810)" "1.2.1.7" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:54:38 +0200] "POST /web/Service HTTP/1.1" 200 791 "-" "-" "1.2.1.4" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:54:41 +0200] "POST /web/Service HTTP/1.1" 200 672 "-" "-" "1.2.1.4" localhost.hosting.corp:80

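1 answer:

Answer 0 (score: 1)

You can do the filtering with awk: take the second-to-last column, $(NF-1), strip the '"' characters (with gensub), and print each value only the first time it appears (tracked with x[ip]).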

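# Requires GNU awk (gawk): gensub() is a gawk extension.
awk '
/\/web\/Service/ {                          # only lines containing /web/Service
      ip = gensub("\"", "", "g", $(NF-1))   # second-to-last field, quotes stripped
      if (!x[ip]) { print ip; x[ip]++ }     # print each IP only on first occurrence
}' log.txt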
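
If gawk is not available, the same idea can probably be written with gsub(), which is part of POSIX awk. The following is only a sketch under a few assumptions: the temporary file created with mktemp and the variable names log, unique_ips and seen are made up for illustration, and the log lines are the sample from the question. It writes the sample to a file so the script can be run as-is.

```shell
#!/bin/sh
# Write the sample log lines from the question to a temporary file
# (the file is an assumption for illustration; use your real log instead).
log=$(mktemp)
cat > "$log" <<'EOF'
1.9.2.2 - - [18/Oct/2019:11:53:54 +0200] "POST /web/Service HTTP/1.1" 200 793 "-" "Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50727.8810)" "1.2.2.10" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:53:55 +0200] "POST /web2/Service HTTP/1.1" 200 791 "-" "-" "1.2.1.4" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:54:12 +0200] "POST /web/Service HTTP/1.1" 200 793 "-" "Mozilla/4.0 (compatible; MSIE 6.0; MS Web Services Client Protocol 2.0.50727.8810)" "1.2.1.7" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:54:38 +0200] "POST /web/Service HTTP/1.1" 200 791 "-" "-" "1.2.1.4" localhost.hosting.corp:80
1.9.2.2 - - [18/Oct/2019:11:54:41 +0200] "POST /web/Service HTTP/1.1" 200 672 "-" "-" "1.2.1.4" localhost.hosting.corp:80
EOF

# gsub() edits the variable in place, so copy the field first.
# seen[ip]++ is 0 (false) the first time an IP appears, so !seen[ip]++
# is true exactly once per distinct IP.
unique_ips=$(awk '
/\/web\/Service/ {
    ip = $(NF-1)
    gsub(/"/, "", ip)
    if (!seen[ip]++) print ip
}' "$log")

printf '%s\n' "$unique_ips"
rm -f "$log"
```

On the five sample lines this prints 1.2.2.10, 1.2.1.7 and 1.2.1.4, one per line, in order of first appearance. The /web2/Service request is skipped because the text "/web/Service" does not occur in that line.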