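How to find every occurrence of a phrase in a text and replace it with another phrase (IPv4 address search and validation)

Time: 2013-11-06 15:02:16

Tags: regex perl bash sed awk

How can I find every occurrence of a phrase in a text and replace it with another phrase (using sed, awk, grep and/or perl)?

The task is to replace one IPv4 address phrase with another, and to verify that the value found is a valid IPv4 address.

Here is an example: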

ip_node:<ip_address> e.g. ip_node:192.168.0.1

These should be replaced with

ip_address:<ip_address> e.g. ip_address:192.168.0.1

Other assumptions:

-> Phrases such as
   192.168.000.001, 072.12.01.1, 256.224.1.010, 20.128.300.01
   and similar (octets with leading zeros or values above 255) are not
   proper IPv4 addresses. They should be marked as invalid and, if
   possible, written to a separate file.
-> Phrases with fewer or more than 4 octets are not valid IPv4 addresses
   either. They should be marked as invalid and, if possible, written
   to a separate file.
-> Phrases containing any character other than the digits 0-9 and the
   separating dots are not valid IPv4 addresses either. They should be
   marked as invalid and, if possible, written to a separate file.
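
The three assumptions above can be expressed as a single anchored pattern. The following is only a minimal sketch, assuming a POSIX shell and grep with -E and -x support; the function name is_valid_ipv4 is hypothetical and used purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) only for a strict dotted quad.
is_valid_ipv4() {
    # One octet: 0, 1-99, 100-199, 200-249 or 250-255, never with a leading zero.
    octet='(0|[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5])'
    # -x forces the whole line to match, so extra or missing octets are rejected.
    printf '%s\n' "$1" | grep -Eqx "$octet(\.$octet){3}"
}

# Small demonstration on values taken from the sample file.
for ip in 192.168.0.1 01.0.01.0 255.259.255.259 172.16.0; do
    if is_valid_ipv4 "$ip"; then
        echo "$ip valid"
    else
        echo "$ip invalid"
    fi
done
```

Only the first of the four demonstration values passes; the others fail on a leading zero, an octet above 255, and a missing octet respectively.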

Contents of the sample file:

ip_node:192.168.0.1
ip_node:192.268.0.01
ip_node:10.0.0.0
ip_node:10.0.0000.10
ip_node:10.255.255.255
ip_node:10.255.255.255.12
ip_node:172.16.0.0
ip_node:172.16.0
ip_node:172.31.255.255
ip_node:172.31.255.
ip_node:0.0.0.0
ip_node:01.0.01.0
ip_node:255.255.255.255
ip_node:255.259.255.259
ip_node:224.0.0.0
ip_node:224.0.
ip_node:207.142.131.236
ip_node:207.002.001.06
ip_node:255.255.255.0
ip_node:055.2255.1255.0
ip_node:204.144.134.234
ip_node:2o7.0o2.0E.O6
ip_node:245.245.245.40
ip_node:O55.2255.1255.a0

Contents of output file1-all (all entries):

ip_address:192.168.0.1
[!]ip_node:192.268.0.01  -- [invalid IP address]
ip_address:10.0.0.0
[!]ip_node:10.0.0000.10  -- [invalid IP address]
ip_address:10.255.255.255
[!]ip_node:10.255.255.255.12  -- [invalid IP address]
ip_address:172.16.0.0
[!]ip_node:172.16.0  -- [invalid IP address]
ip_address:172.31.255.255
[!]ip_node:172.31.255.  -- [invalid IP address]
ip_address:0.0.0.0
[!]ip_node:01.0.01.0  -- [invalid IP address]
ip_address:255.255.255.255
[!]ip_node:255.259.255.259  -- [invalid IP address]
ip_address:224.0.0.0
[!]ip_node:224.0.  -- [invalid IP address]
ip_address:207.142.131.236
[!]ip_node:207.002.001.06  -- [invalid IP address]
ip_address:255.255.255.0
[!]ip_node:055.2255.1255.0  -- [invalid IP address]
ip_address:204.144.134.234
[!]ip_node:2o7.0o2.0E.O6  -- [invalid IP address]
ip_address:245.245.245.40
[!]ip_node:O55.2255.1255.a0  -- [invalid IP address]

Contents of output file2-bad (invalid entries only):

[!]ip_node:192.268.0.01  -- [invalid IP address]
[!]ip_node:10.0.0000.10  -- [invalid IP address]
[!]ip_node:10.255.255.255.12  -- [invalid IP address]
[!]ip_node:172.16.0  -- [invalid IP address]
[!]ip_node:172.31.255.  -- [invalid IP address]
[!]ip_node:01.0.01.0  -- [invalid IP address]
[!]ip_node:255.259.255.259  -- [invalid IP address]
[!]ip_node:224.0.  -- [invalid IP address]
[!]ip_node:207.002.001.06  -- [invalid IP address]
[!]ip_node:055.2255.1255.0  -- [invalid IP address]
[!]ip_node:2o7.0o2.0E.O6  -- [invalid IP address]
[!]ip_node:O55.2255.1255.a0  -- [invalid IP address]

Contents of output file3-good (valid entries only):

ip_address:192.168.0.1
ip_address:10.0.0.0
ip_address:10.255.255.255
ip_address:172.16.0.0
ip_address:172.31.255.255
ip_address:0.0.0.0
ip_address:255.255.255.255
ip_address:224.0.0.0
ip_address:207.142.131.236
ip_address:255.255.255.0
ip_address:204.144.134.234
ip_address:245.245.245.40

What I tried:

sed -i -e "s/ip_node:/ip_address:/g" <file>

and this:

sed -i -e "s/ip_node:^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$/ip_address:^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$/g" <file>

2 answers:

Answer 0 (score: 2)

What is wrong with "01.0.01.0" or "207.002.001.06"? Why do you consider them invalid?

perl -MRegexp::Common -nle '
    # Capture the text after "ip_node:" and test it against the IPv4 pattern
    if (/ip_node:(\S+)/ && $1 =~ /^($RE{net}{IPv4})$/) {
        print "ip_address:", $1;
    } else {
        print "[!]", $_;
    }
' filename

Output

ip_address:192.168.0.1
[!]ip_node:192.268.0.01
ip_address:10.0.0.0
[!]ip_node:10.0.0000.10
ip_address:10.255.255.255
[!]ip_node:10.255.255.255.12
ip_address:172.16.0.0
[!]ip_node:172.16.0
ip_address:172.31.255.255
[!]ip_node:172.31.255.
ip_address:0.0.0.0
ip_address:01.0.01.0
ip_address:255.255.255.255
[!]ip_node:255.259.255.259
ip_address:224.0.0.0
[!]ip_node:224.0.
ip_address:207.142.131.236
ip_address:207.002.001.06
ip_address:255.255.255.0
[!]ip_node:055.2255.1255.0
ip_address:204.144.134.234
[!]ip_node:2o7.0o2.0E.O6
ip_address:245.245.245.40
[!]ip_node:O55.2255.1255.a0

To split this into the output files you want, I would use:

perl -MRegexp::Common -nle '... as above ...' filename |
tee file1-all |
awk '/^\[!\]/ {print > "file2-bad"; next} {print > "file3-good"}'

Answer 1 (score: 0)

An awk solution:

awk -f check.awk input.txt

where check.awk is:

{
    split($0,a,":")  #splits line at ":"
    ip=a[2]          #part after ":"
    if (validIP(ip))
        printf "ip_address:%s%s", ip, ORS  
    else 
        printf "[!]%s%s%s",$0," -- [invalid IP address]", ORS

}

function validIP(ip, a, n, i) { #last three parameters are local variables
    n=split(ip,a,".")           #splits ip in 4 octets
    if (n==4) {
        for (i=1; i<=4; i++) {
            if (!(match(a[i],/^[0-9]+$/))) return (0)  #must be digits
            if (match(a[i],/^0[0-9]+/)) return (0)     #no leading zeros
            if (a[i]>255) return (0)
        }
        return (1)
    }
    else return(0)
}
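
The script above prints everything to standard output. For the three output files requested in the question, the same checks could write all of them in a single pass. This is only a sketch under assumptions: the wrapper name split_ip_nodes and its argument order are hypothetical, and the awk body is a lightly adapted copy of validIP rather than the answer's exact code.

```shell
#!/bin/sh
# Hypothetical wrapper: classify each line once and write three files.
# $1 = input file, $2 = all entries, $3 = invalid entries, $4 = valid entries
split_ip_nodes() {
    awk -v all="$2" -v bad="$3" -v good="$4" '
    function validIP(ip,    a, n, i) {          # a, n, i are local variables
        n = split(ip, a, ".")
        if (n != 4) return 0                     # exactly four octets
        for (i = 1; i <= 4; i++) {
            if (a[i] !~ /^[0-9]+$/) return 0     # digits only
            if (a[i] ~ /^0[0-9]/) return 0       # no leading zeros
            if (a[i] + 0 > 255) return 0         # octet range 0-255
        }
        return 1
    }
    {
        ip = $0
        sub(/^ip_node:/, "", ip)                 # text after the prefix
        if ($0 ~ /^ip_node:/ && validIP(ip)) {
            line = "ip_address:" ip
            print line > good
        } else {
            line = "[!]" $0 "  -- [invalid IP address]"
            print line > bad
        }
        print line > all                         # every entry, in input order
    }' "$1"
}
```

It could then be called as: split_ip_nodes input.txt file1-all file2-bad file3-good. Note that awk only creates an output file once a line is written to it, so an input without invalid entries leaves no file2-bad behind.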