Question

I need some help. I have a file whose columns contain host IPs and ports, so the file looks like this:

Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/

So I want to grep the hosts and ports into this format:

192.168.0.1  80,443
192.168.0.2  80,443
192.168.0.3  8080

If anyone knows how to achieve this with awk and grep, please also explain the syntax to me. Thanks.

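What I have tried:

I extracted the hosts and the ports into separate files and then used the paste command to join them into a new file, but the problem is that the same IP repeats with different ports, and I would like the data to be clean.
I have Googled and found a command that does this: cat ips-ports | | grep Host | awk '{print $2,$7}' | sed 's@/.*@@' | sort -t' ' -n -k2 | awk -F' ' -v OFS=' ' '{x=$1;$1="";a[x]=a[x]","$0}END{for(x in a) print x,a[x]}' | sed 's/, /,/g' | sed 's/ ,/ /' | sort -V -k1 | cut -d " " -f2

But I would like to understand what it does, because it does not work as expected on my file.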

Answer 1

Could you please try the following.

awk '
BEGIN{
  OFS=","
}
{
  match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)
  split($NF,array,"/")
  val=substr($0,RSTART,RLENGTH)
  a[val]=(a[val]?a[val] OFS:"")array[1]
}
END{
  for(i in a){
    print i FS a[i]
  }
}
' Input_file

Explanation: a detailed, line-by-line explanation of the code above.

awk '                                              ##Starting awk program from here.
BEGIN{                                             ##Starting BEGIN section from here.
  OFS=","                                          ##Set OFS as comma here.
}                                                  ##Closing BLOCK for BEGIN section here.
{
  match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)       ##Using match function to match the IP regex here; it sets RSTART and RLENGTH.
  split($NF,array,"/")                             ##Splitting last field into an array named array with delimiter /
  val=substr($0,RSTART,RLENGTH)                    ##Creating a variable named val holding the sub-string of the line that starts at RSTART and is RLENGTH characters long (the IP).
  a[val]=(a[val]?a[val] OFS:"")array[1]            ##Creating an array named a with index val; appending the port (array[1]) to its existing value, separated by OFS.
}
END{                                               ##Starting END BLOCK for this awk program.
  for(i in a){                                     ##Starting for loop here.
    print i FS a[i]                                ##Printing variable i, FS and value of array a with index i here.
  }                                                ##Closing BLOCK for, for loop here.
}                                                  ##Closing BLOCK for END section of this program here.
'  Input_file                                      ##Mentioning Input_file here.
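
One detail worth keeping in mind: `for(i in a)` visits the array keys in an unspecified order, so the hosts are not guaranteed to come out sorted. A minimal sketch of one way to get a stable order, assuming the sample data from the question and a temporary file created with `mktemp` (the variable names `input_file` and `result` are only illustrative), is to pipe the output through `sort` keyed on the four octets:

```shell
# Recreate the sample data from the question in a temporary file.
input_file=$(mktemp)
cat > "$input_file" <<'EOF'
Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/
EOF

# Same grouping logic as above; sort then orders the lines
# numerically by each dot-separated octet of the IP.
result=$(awk '
{
  match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)   # locate the IP in the line
  split($NF,array,"/")                         # array[1] is the port number
  val=substr($0,RSTART,RLENGTH)                # the matched IP
  a[val]=(a[val]?a[val] ",":"")array[1]        # append the port to this IP
}
END{
  for(i in a) print i, a[i]
}' "$input_file" | sort -t. -k1,1n -k2,2n -k3,3n -k4,4n)

printf '%s\n' "$result"
rm -f "$input_file"
```

The ports of each host stay in the order in which they appear in the input; only the order of the hosts is changed by `sort`.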

Answer 2

One more awk

Output:

$ awk -F '[ /]' '{arr[$4]=$4 in arr?arr[$4]","$6:$6}END{for(i in arr)print i,arr[i]}' infile
192.168.0.1 80,443
192.168.0.2 80,443
192.168.0.3 8080

Input:

$ cat infile
Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/

A more readable version:

awk -F '[ /]' '{
                arr[$4] = $4 in arr ? arr[$4] "," $6 : $6
              }
           END{
                for(i in arr)
                   print i,arr[i]
              }' infile
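
To see why `$4` is the host and `$6` is the port, it can help to print every field with its index. This is only an illustrative sketch on one sample line (the variable names `line` and `fields` are made up for the example); `-F '[ /]'` splits the line on every single space or slash:

```shell
line='Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/'

# Print each field as index=value, using the same separator as the answer.
fields=$(printf '%s\n' "$line" |
  awk -F '[ /]' '{ for (i = 1; i <= NF; i++) printf "%d=%s\n", i, $i }')

printf '%s\n' "$fields"
```

Field 4 is `192.168.0.1` and field 6 is `80`; the trailing slash also produces an empty ninth field. Because each single space counts as a separator here, lines with two consecutive spaces would shift the field numbers, so this approach assumes the input is spaced exactly like the sample.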

Answer 3

The following short pipeline of sed and awk:

# first `sed` with a regex extract the host and ports:
sed 's/.*Host:[[:blank:]]*\([^[:blank:]]*\)[[:blank:]]*Ports:[[:blank:]]*\([0-9]*\)\/.*/\1 \2/' |
# then awk to join the fields with a comma:
awk '{ a[$1] = a[$1] (a[$1]?",":"") $2 }  END{ for (i in a) print i, a[i] }'

With the following input:

Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/

Output:

192.168.0.1 80,443
192.168.0.2 80,443
192.168.0.3 8080

Tested on repl.
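
To make the first stage easier to follow, here is a small sketch that runs only the `sed` command on a single sample line (the variable names `line` and `pair` are made up for the example). The first capture group grabs the non-blank text after `Host:`, the second grabs the digits after `Ports:`, and the whole line is replaced by the two groups separated by a space:

```shell
line='Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/'

# \1 is the host, \2 is the port number in front of the first slash.
pair=$(printf '%s\n' "$line" |
  sed 's/.*Host:[[:blank:]]*\([^[:blank:]]*\)[[:blank:]]*Ports:[[:blank:]]*\([0-9]*\)\/.*/\1 \2/')

printf '%s\n' "$pair"
```

This prints `192.168.0.3 8080`, which is the two-column form that the `awk` stage then groups by its first column.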
