I need some help. I have a file whose columns contain, among other things, a host IP and a port, so the file looks like this:
Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/
So I want to grep the hosts and ports into this format:
192.168.0.1 80,443
192.168.0.2 80,443
192.168.0.3 8080
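If anyone knows how to do this with awk and grep, please also explain the syntax to me. Thanks.
What I have tried:
Extracting the hosts and the ports into separate files and then joining them into a new file with the paste command. The problem is that the same IP repeats with different ports, and I would like the data to be clean.
I Googled and found a command that is supposed to do this: cat ips-ports | | grep Host | awk '{print $2,$7}' | sed 's@/.*@@' | sort -t' ' -n -k2 | awk -F' ' -v OFS=' ' '{x=$1;$1="";a[x]=a[x]","$0}END{for(x in a) print x,a[x]}' | sed 's/, /,/g' | sed 's/ ,/ /' | sort -V -k1 | cut -d " " -f2
But I would like to understand what it does, because it does not work as expected on my file.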
Answer 0: (score: 1)
Could you please try the following.
awk '
BEGIN{
OFS=","
}
{
match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)
split($NF,array,"/")
val=substr($0,RSTART,RLENGTH)
a[val]=(a[val]?a[val] OFS:"")array[1]
}
END{
for(i in a){
print i FS a[i]
}
}
' Input_file
Explanation: Adding a detailed explanation of the above code.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section from here.
OFS="," ##Set OFS as comma here.
} ##Closing BLOCK for BEGIN section here.
{
match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) ##Using match function to match the IP regex here; it sets RSTART and RLENGTH.
split($NF,array,"/") ##Splitting last field into an array named array with delimiter /
val=substr($0,RSTART,RLENGTH) ##Creating a variable named val holding the sub-string of the line that starts at RSTART and is RLENGTH characters long (the matched IP).
a[val]=(a[val]?a[val] OFS:"")array[1] ##Appending the port (array[1]) to the entry of array a indexed by val, with OFS (a comma) in between if the entry already has a value.
}
END{ ##Starting END BLOCK for this awk program.
for(i in a){ ##Starting for loop here.
print i FS a[i] ##Printing variable i, FS and value of array a with index i here.
} ##Closing BLOCK for, for loop here.
} ##Closing BLOCK for END section of this program here.
' Input_file ##Mentioning Input_file here.
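One thing the explanation above does not cover: awk does not specify the order in which `for (i in a)` visits the indexes, so the hosts may come out in a different order than in the file. The following is a minimal sketch of a variant that keeps first-seen order. The function name `group_ports` and the helper arrays `order` and `ports` are hypothetical names chosen for this illustration, not part of the answer.

```shell
# Hypothetical helper: groups ports per host, keeping hosts in first-seen order.
group_ports() {
    awk '
    {
        # Find the first dotted-quad on the line; skip lines that have none.
        if (!match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)) next
        host = substr($0, RSTART, RLENGTH)
        split($NF, parts, "/")              # "80/open/tcp/" gives parts[1] = "80"
        if (host in ports) {
            ports[host] = ports[host] "," parts[1]
        } else {
            order[++n] = host               # remember when this host first appeared
            ports[host] = parts[1]
        }
    }
    END {
        for (i = 1; i <= n; i++) print order[i], ports[order[i]]
    }'
}

result=$(group_ports <<'EOF'
Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/
EOF
)
printf '%s\n' "$result"
```

On a real file this would be called as `group_ports < Input_file`. Piping the original answer's output through `sort` is an alternative if sorted order, rather than file order, is what you want.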
Answer 1: (score: 1)
Another awk solution.
Output:
$ awk -F '[ /]' '{arr[$4]=$4 in arr?arr[$4]","$6:$6}END{for(i in arr)print i,arr[i]}' infile
192.168.0.1 80,443
192.168.0.2 80,443
192.168.0.3 8080
Input:
$ cat infile
Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/
A more readable version:
awk -F '[ /]' '{
arr[$4] = $4 in arr ? arr[$4] "," $6 : $6
}
END{
for(i in arr)
print i,arr[i]
}' infile
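Since the question asks for the syntax to be explained, here is a small sketch of why `$4` and `$6` are the host and the port: with `-F '[ /]'` every single space or slash is a field separator. The function name `show_fields` is a hypothetical helper introduced only for this illustration.

```shell
# Hypothetical helper: print every field of each line with its number,
# using the same field separator as the answer (one space or one slash).
show_fields() {
    awk -F '[ /]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

fields=$(printf '%s\n' 'Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/' | show_fields)
printf '%s\n' "$fields"
```

Because each separator character counts on its own, the field numbers shift if the real file has tabs or repeated spaces between the columns. That may be one reason a command copied from elsewhere fails on a file with a slightly different layout.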
Answer 2: (score: 0)
A short pipeline of sed and awk, as follows:
# first `sed` with a regex extract the host and ports:
sed 's/.*Host:[[:blank:]]*\([^[:blank:]]*\)[[:blank:]]*Ports:[[:blank:]]*\([0-9]*\)\/.*/\1 \2/' |
# then awk to join the fields with a comma:
awk '{ a[$1] = a[$1] (a[$1]?",":"") $2 } END{ for (i in a) print i, a[i] }'
With the following input:
Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.2 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/
Output:
192.168.0.1 80,443
192.168.0.2 80,443
192.168.0.3 8080
Tested on repl.
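To see what each stage of that pipeline contributes, it can help to run the sed stage on its own. The sketch below does that on three of the sample lines. The function name `extract_pairs` is a hypothetical wrapper added for illustration, and the sed expression is copied unchanged from the answer.

```shell
# Hypothetical wrapper around the sed stage: reduces each line to "host port".
extract_pairs() {
    sed 's/.*Host:[[:blank:]]*\([^[:blank:]]*\)[[:blank:]]*Ports:[[:blank:]]*\([0-9]*\)\/.*/\1 \2/'
}

pairs=$(extract_pairs <<'EOF'
Timestamp: 1573678793 Host: 192.168.0.1 Ports: 80/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.1 Ports: 443/open/tcp/
Timestamp: 1574833457 Host: 192.168.0.3 Ports: 8080/open/tcp/
EOF
)
printf '%s\n' "$pairs"
```

Each line is now just a host and a port separated by one space, which is what the awk stage then joins with commas per host. Note that the pipeline as posted reads standard input, so on a real file it would need the file redirected or passed to sed as an argument.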