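I need to parse a configuration file that contains some optional fields, and I am not interested in all of them. I am using Python's re.findall method.
Here is a sample configuration: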
edit 750
set srcintf "port1"
set dstintf "port9"
set srcaddr "addr1" "addr5"
set dstaddr "addr6"
set action accept
set schedule "always"
set service "ICMP_ANY"
set logtraffic enable
set comments "This is the second one"
set nat enable
set ippool enable
set poolname "name1"
next
This is my regular expression so far:
r'edit ([\d]+)\s+set srcintf "(.+?)"\s+set dstintf "(.+?)"\s+set srcaddr (.+?)\s+set dstaddr (.+?)\s+set action ([\w]+)\s+(?:set status ([\w]+)\s+)?set schedule "(.+?)"\s+set service (.+?)\s+(?:set .*?\s+)*?(?:set poolname "(.+?)"\s+)?(?:set .*\s+)*?next'
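Put simply, I want to ignore everything after set service, except that I want to capture the optional poolname field when it is present.
The problem with my regular expression is that (?:set .*?\s+)*? consumes the set poolname field, despite the non-greedy quantifier.
If poolname were mandatory, the regular expression would work perfectly, but it is not. Any ideas?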
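Answer 0 (score: 1)
It is fairly easy: just introduce a negative lookahead, (?! .. ), so that the group that skips the intermediate set lines cannot match the poolname line.
I suggest using RegexFormat to work with large regular expressions.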
# edit[ ]([\d]+)\s+set[ ]srcintf[ ]"(.+?)"\s+set[ ]dstintf[ ]"(.+?)"\s+set[ ]srcaddr[ ](.+?)\s+set[ ]dstaddr[ ](.+?)\s+set[ ]action[ ]([\w]+)\s+(?:set[ ]status[ ]([\w]+)\s+)?set[ ]schedule[ ]"(.+?)"\s+set[ ]service[ ](.+?)\s+(?:set[ ](?!poolname[ ]".+?").*?\s+)*(?:set[ ]poolname[ ]"(.+?)"\s+)?(?:set[ ].*\s+)*next
edit [ ]
( [\d]+ ) # (1)
\s+ set [ ] srcintf [ ] "
( .+? ) # (2)
" \s+ set [ ] dstintf [ ] "
( .+? ) # (3)
" \s+ set [ ] srcaddr [ ]
( .+? ) # (4)
\s+ set [ ] dstaddr [ ]
( .+? ) # (5)
\s+ set [ ] action [ ]
( [\w]+ ) # (6)
\s+
(?:
set [ ] status [ ]
( [\w]+ ) # (7)
\s+
)?
set [ ] schedule [ ] "
( .+? ) # (8)
" \s+ set [ ] service [ ]
( .+? ) # (9)
\s+
(?:
set [ ]
(?! poolname [ ] " .+? " )
.*?
\s+
)*
(?:
set [ ] poolname [ ] "
( .+? ) # (10)
" \s+
)?
(?: set [ ] .* \s+ )*
next
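Since the question uses Python's re.findall, the expanded pattern above can be carried over to Python more or less directly. The following is a minimal sketch, assuming re.VERBOSE to keep the free-spacing layout (literal spaces stay written as [ ]). The names POLICY_RE, CONFIG and matches are illustrative and do not come from the original post.

```python
import re

# Sample policy from the question.
CONFIG = '''edit 750
set srcintf "port1"
set dstintf "port9"
set srcaddr "addr1" "addr5"
set dstaddr "addr6"
set action accept
set schedule "always"
set service "ICMP_ANY"
set logtraffic enable
set comments "This is the second one"
set nat enable
set ippool enable
set poolname "name1"
next
'''

# Port of the expanded pattern. With re.VERBOSE, whitespace outside
# character classes is ignored, so each literal space is written as [ ].
POLICY_RE = re.compile(r'''
    edit [ ] (\d+) \s+                    # (1) policy id
    set [ ] srcintf [ ] "(.+?)" \s+       # (2)
    set [ ] dstintf [ ] "(.+?)" \s+       # (3)
    set [ ] srcaddr [ ] (.+?) \s+         # (4)
    set [ ] dstaddr [ ] (.+?) \s+         # (5)
    set [ ] action [ ] (\w+) \s+          # (6)
    (?: set [ ] status [ ] (\w+) \s+ )?   # (7) optional
    set [ ] schedule [ ] "(.+?)" \s+      # (8)
    set [ ] service [ ] (.+?) \s+         # (9)
    (?: set [ ] (?! poolname [ ] ".+?" ) .*? \s+ )*   # skip any line except poolname
    (?: set [ ] poolname [ ] "(.+?)" \s+ )?           # (10) optional
    (?: set [ ] .* \s+ )*                 # skip whatever is left
    next
''', re.VERBOSE)

# findall returns one tuple per policy; groups that did not
# participate in the match come back as empty strings.
matches = POLICY_RE.findall(CONFIG)
for fields in matches:
    print(fields)
```

With the sample above, the last element of the tuple is the pool name and the seventh element (the optional status) is an empty string.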
Perl test case
$/ = undef;
$str = <DATA>;
while ( $str =~ /edit[ ]([\d]+)\s+set[ ]srcintf[ ]"(.+?)"\s+set[ ]dstintf[ ]"(.+?)"\s+set[ ]srcaddr[ ](.+?)\s+set[ ]dstaddr[ ](.+?)\s+set[ ]action[ ]([\w]+)\s+(?:set[ ]status[ ]([\w]+)\s+)?set[ ]schedule[ ]"(.+?)"\s+set[ ]service[ ](.+?)\s+(?:set[ ](?!poolname[ ]".+?").*?\s+)*(?:set[ ]poolname[ ]"(.+?)"\s+)?(?:set[ ].*\s+)*next/g )
{
print "----------------------\n";
print "1 = $1\n";
print "2 = $2\n";
print "3 = $3\n";
print "4 = $4\n";
print "5 = $5\n";
print "6 = $6\n";
print "7 = $7\n";
print "8 = $8\n";
print "9 = $9\n";
print "Poolname = $10\n";
}
__DATA__
edit 750
set srcintf "port1"
set dstintf "port9"
set srcaddr "addr1" "addr5"
set dstaddr "addr6"
set action accept
set schedule "always"
set service "ICMP_ANY"
set logtraffic enable
set comments "This is the second one"
set nat enable
set ippool enable
set poolname "name1"
next
Output >>
----------------------
1 = 750
2 = port1
3 = port9
4 = "addr1" "addr5"
5 = "addr6"
6 = accept
7 =
8 = always
9 = "ICMP_ANY"
Poolname = name1
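To see the effect of the lookahead in Python, the original and the corrected pattern can be run side by side. This is a rough sketch: the second policy (edit 751, without poolname) is made up for illustration, the names HEAD, POOL, ORIGINAL, FIXED and poolnames are hypothetical, and [\d]+ and [\w]+ are written as the equivalent \d+ and \w+. In Python the original pattern still matches, but its poolname group stays empty, because a path that skips the optional group succeeds first and the trailing skip group then consumes the poolname line.

```python
import re

# First policy is the sample from the question (shortened);
# the second one is a made-up policy without poolname.
CONFIG = '''edit 750
set srcintf "port1"
set dstintf "port9"
set srcaddr "addr1" "addr5"
set dstaddr "addr6"
set action accept
set schedule "always"
set service "ICMP_ANY"
set logtraffic enable
set nat enable
set ippool enable
set poolname "name1"
next
edit 751
set srcintf "port2"
set dstintf "port3"
set srcaddr "addr2"
set dstaddr "addr7"
set action accept
set schedule "always"
set service "ALL"
set logtraffic enable
next
'''

# Part shared by both patterns: everything up to and including "set service".
HEAD = (r'edit (\d+)\s+set srcintf "(.+?)"\s+set dstintf "(.+?)"\s+'
        r'set srcaddr (.+?)\s+set dstaddr (.+?)\s+set action (\w+)\s+'
        r'(?:set status (\w+)\s+)?set schedule "(.+?)"\s+set service (.+?)\s+')
POOL = r'(?:set poolname "(.+?)"\s+)?'

# Pattern from the question: lazy skip groups around the optional poolname.
ORIGINAL = HEAD + r'(?:set .*?\s+)*?' + POOL + r'(?:set .*\s+)*?next'
# Pattern from the answer: greedy skip group that refuses the poolname line.
FIXED = HEAD + r'(?:set (?!poolname ".+?").*?\s+)*' + POOL + r'(?:set .*\s+)*next'


def poolnames(pattern, text):
    """Return the poolname capture (last group) of every matched policy."""
    return [fields[-1] for fields in re.findall(pattern, text)]


original_pools = poolnames(ORIGINAL, CONFIG)
fixed_pools = poolnames(FIXED, CONFIG)
print(original_pools)
print(fixed_pools)
```

The original pattern yields an empty pool name for both policies, while the corrected one yields name1 for policy 750 and an empty string for policy 751, which has no poolname line.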