I am trying to use a Perl regex to capture the data in this:
variable_myname(variable_data);
So I used:
variable_([A-Za-z_]+)(\s+)?\((.*?)\)
This lets me capture the variable's myname (the part after the variable_ prefix) as well as the data inside the (...).
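As a minimal sketch of how that pattern behaves on the simple case (the variable names $re, $line, $name and $data are only illustrative), note that the optional whitespace sits in group 2, so the data ends up in group 3:

```perl
use strict;
use warnings;

# The pattern from the question: group 1 is the name, group 2 the
# optional whitespace, group 3 the data between the parentheses.
my $re = qr/variable_([A-Za-z_]+)(\s+)?\((.*?)\)/;

my $line = 'variable_myname(variable_data);';
my ($name, $data);
if ($line =~ $re) {
    ($name, $data) = ($1, $3);
    print "name=<$name> data=<$data>\n";
}
```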
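However, it does not work if the user uses this (allowed) syntax:
variable_oneexp("This is a value ( ... ) ");
because a ( or ) inside the "..." should be ignored.
The same behavior should apply if ' is used: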
variable_twoexp('This is a value ( ... ) ');
Finally, this should also be supported:
variable_threeexp('This is a value ' + ' another string ');
That said, I don't think the last example has any effect on the regex.
Any pointers/help would be appreciated.
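To make the failure concrete, here is a small sketch that runs the pattern from the question over the three example calls (the array names @inputs and @captured are only illustrative). The lazy .*? stops at the first ) it meets, whether or not that ) is inside a quoted string:

```perl
use strict;
use warnings;

# The pattern from the question; the data is captured in group 3.
my $re = qr/variable_([A-Za-z_]+)(\s+)?\((.*?)\)/;

my @inputs = (
    q{variable_oneexp("This is a value ( ... ) ");},
    q{variable_twoexp('This is a value ( ... ) ');},
    q{variable_threeexp('This is a value ' + ' another string ');},
);

my @captured;
for my $line (@inputs) {
    if ($line =~ $re) {
        push @captured, $3;          # remember what the lazy group grabbed
        print "<$1> <$3>\n";
    }
}
```

The first two captures are cut off at the ) inside the string, while the third one comes out whole because its strings contain no parentheses.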
Answer 0 (score: 0)
You can use a negated character class instead of the lazy .*?, and then use an alternation to match anything between single or double quotes:
variable_([A-Za-z_]+)\s*\(((?:'[^']+'|"[^"]+"|[^()])+)\)
I also removed the capture group around \s+ and changed it to \s*, since I don't think you need the whitespace in a capture group. Revert that if you do.
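A short sketch of this pattern applied to the three calls from the question (again, @inputs and @captured are just illustrative names). Because the whitespace group is gone, the data is now in group 2:

```perl
use strict;
use warnings;

# Quoted strings are tried first, so parentheses inside them are
# consumed as part of the string; [^()] handles everything else.
my $re = qr/variable_([A-Za-z_]+)\s*\(((?:'[^']+'|"[^"]+"|[^()])+)\)/;

my @inputs = (
    q{variable_oneexp("This is a value ( ... ) ");},
    q{variable_twoexp('This is a value ( ... ) ');},
    q{variable_threeexp('This is a value ' + ' another string ');},
);

my @captured;
for my $line (@inputs) {
    if ($line =~ $re) {
        push @captured, $2;
        print "<$1> <$2>\n";
    }
}
```

Note that '[^']+' and "[^"]+" do not account for escaped quotes or for parentheses outside strings, so this stays a deliberately simple approach.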
Answer 1 (score: 0)
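I don't know your use case, so a simple regex may well be enough for your needs. However, it is possible to match this more completely. The following demonstrates the OP's regex, Jerry's regex, and my own against 6 different example functions. The last 3 examples are cases where Jerry's simpler solution fails.
I used the /x modifier so that I could include whitespace in the regex to make it easier to read: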
use strict;
use warnings;

while (<DATA>) {
    chomp;
    print "$. - $_\n";

    # Original OP
    if (/variable_([A-Za-z_]+)\s*\((.*?)\)/) {
        printf "OP: <%s>, <%s>\n", $1, $2;
    }

    # Jerry's answer
    if (/variable_([A-Za-z_]+)\s*\(((?:'[^']+'|"[^"]+"|[^()])+)\)/) {
        printf "A1: <%s>, <%s>\n", $1, $2;
    }

    # Covers all standard single- and double-quoted strings and nested parentheses.
    # (?2) recurses into capture group 2, the argument list itself.
    if (/variable_([A-Za-z_]+)\s*\((
            (?:
                (?> [^'"()]+ )
            |
                ' (?: (?>[^'\\]+) | \\. )* '
            |
                " (?: (?>[^"\\]+) | \\. )* "
            |
                \( (?2) \)
            )*
        )\)/x) {
        printf "A2: <%s>, <%s>\n", $1, $2;
    }

    print "\n";
}

__DATA__
variable_oneexp("This is a value ( ... ) ");
variable_twoexp('This is a value ( ... ) ');
variable_threeexp('This is a value ' + ' another string ');
variable_fourexp(' \') <-- a paren');
variable_fiveexp(mysub('value'), 'value2');
variable_sixexp('This is a )(value ', variable_five("testing ()"), "st\")ring", (3-2)/1);
Output:
1 - variable_oneexp("This is a value ( ... ) ");
OP: <oneexp>, <"This is a value ( ... >
A1: <oneexp>, <"This is a value ( ... ) ">
A2: <oneexp>, <"This is a value ( ... ) ">
2 - variable_twoexp('This is a value ( ... ) ');
OP: <twoexp>, <'This is a value ( ... >
A1: <twoexp>, <'This is a value ( ... ) '>
A2: <twoexp>, <'This is a value ( ... ) '>
3 - variable_threeexp('This is a value ' + ' another string ');
OP: <threeexp>, <'This is a value ' + ' another string '>
A1: <threeexp>, <'This is a value ' + ' another string '>
A2: <threeexp>, <'This is a value ' + ' another string '>
4 - variable_fourexp(' \') <-- a paren');
OP: <fourexp>, <' \'>
A1: <fourexp>, <' \'>
A2: <fourexp>, <' \') <-- a paren'>
5 - variable_fiveexp(mysub('value'), 'value2');
OP: <fiveexp>, <mysub('value'>
A2: <fiveexp>, <mysub('value'), 'value2'>
6 - variable_sixexp('This is a )(value ', variable_five("testing ()"), "st\")ring", (3-2)/1);
OP: <sixexp>, <'This is a >
A1: <sixexp>, <'This is a >
A2: <sixexp>, <'This is a )(value ', variable_five("testing ()"), "st\")ring", (3-2)/1>
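
The A2 pattern can be hard to read at first, so here is a hedged restatement of the same idea using named groups and a (?(DEFINE)...) block (available in Perl 5.10 and later). The group names name, args and body, and the variable names, are my own choices for illustration and are not part of the answer above:

```perl
use strict;
use warnings;

my $call = qr{
    variable_ (?<name> [A-Za-z_]+ ) \s*
    \( (?<args> (?&body) ) \)

    (?(DEFINE)
        (?<body>
            (?:
                (?> [^'"()]+ )                  # plain text, no quotes or parens
              | ' (?: (?>[^'\\]+) | \\. )* '    # single-quoted string, backslash escapes allowed
              | " (?: (?>[^"\\]+) | \\. )* "    # double-quoted string, backslash escapes allowed
              | \( (?&body) \)                  # balanced nested parentheses
            )*
        )
    )
}x;

my @inputs = (
    q{variable_fourexp(' \') <-- a paren');},
    q{variable_sixexp('This is a )(value ', variable_five("testing ()"), "st\")ring", (3-2)/1);},
);

my @args;
for my $line (@inputs) {
    if ($line =~ $call) {
        push @args, $+{args};
        print "<$+{name}> <$+{args}>\n";
    }
}
```

Each alternative consumes at least one character before it recurses, so the recursion always makes progress, and the atomic groups (?>...) keep the engine from backtracking into runs of plain text.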