I'm trying to validate Excel-style formulas using the following regular expression:
=SUM\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\)
Against this input:
Should pass
=SUM(A1,A11:A212,A12:A56,A342:A12,A3)
=SUM(A11:A12,A12:a12,A34:A3)
=SUM(A1,A2,A3)
=SUM(A1)
Should fail
=SUM(A11:A212:A2,A12:A56,A4,A342:A12)
The validation part works, but I can't figure out how to capture each comma-separated value in its own group.
How I'd like them grouped:
=SUM(A1,A11:A12,A12:A56,A3) // Groups: $1 = A1 $2 = A11:A12 $3 = A12:A56 $4 = A3
=SUM(A11:A12,A10:A12,A34:A3) // Groups: $1 = A11:A12 $2 = A10:A12 $3 = A34:A3
=SUM(A1,A2,A3) //Groups: $1 = A1 $2 = A2 $3 = A3
=SUM(A1) //Groups: $1 = A1
How they are currently grouped:
=SUM(A1,A11:A12,A12:A56,A3) // Groups: $1 = A1 $2 = A3
=SUM(A11:A12,A10:A12,A34:A3) // Groups: $1 = A11:A12 $2 = A34:A3
=SUM(A1,A2,A3) //Groups: $1 = A1 $2 = A3
=SUM(A1) //Groups: $1 = A1
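Note that it captures only the first and the last value. I'm very new to regex, so if I'm doing something awful here, please point me in the right direction. Thanks!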
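Answer 0 (score: 1)
This is not possible with a single pattern: (...)(?:,(...))+
has 2 capturing groups, so it will always yield exactly 2 captures, no matter how many times the +
matches. In most regex engines a repeated group keeps only the text of its last iteration.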
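To make the limitation concrete, here is a minimal Python sketch that runs the question's own pattern against one of its sample formulas. Python's `re` module is used purely for illustration (the thread itself uses PHP and Perl); the variable names are hypothetical.

```python
import re

# The pattern from the question: group 1 is the first value,
# group 2 is the repeated ",value" part.
pattern = re.compile(r"=SUM\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\)")

m = pattern.fullmatch("=SUM(A1,A11:A12,A12:A56,A3)")
first, last = m.groups()

# Group 2 matched three times, but only its final iteration is kept.
print(pattern.groups)  # 2
print(first)           # A1
print(last)            # ,A3
```

The middle arguments are consumed by the match but are not retrievable from any group, which is why a second pass over the argument list is needed.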
You need (at least) the following two steps:
value := /\w+\d+(?::\w+\d+)?/
value_list := /value(?:,value)*/
expression := /=SUM\((value_list)\)/
Now match expression, take group 1 (the value_list), and within that match find all occurrences of value.
Quick PHP demo:
$text = 'should pass
=SUM(A1,A11:A212,A12:A56,A342:A12,A3)
=SUM(A11:A12,A12:a12,A34:A3)
=SUM(A1,A2,A3)
=SUM(A1)
should fail
=SUM(A11:A212:A2,A12:A56,A4,A342:A12)';
$value = "\w+\d+(?::\w+\d+)?";
$value_list = "$value(?:,$value)*";
$expression = "=SUM\(($value_list)\)";
preg_match_all("/$expression/", $text, $matches);
// iterate over $value_list from $expression (group 1)
foreach($matches[1] as $group1) {
preg_match_all("/$value/", $group1, $m);
print_r($m);
}
Prints:
Array ( [0] => Array ( [0] => A1 [1] => A11:A212 [2] => A12:A56 [3] => A342:A12 [4] => A3 ) )
Array ( [0] => Array ( [0] => A11:A12 [1] => A12:a12 [2] => A34:A3 ) )
Array ( [0] => Array ( [0] => A1 [1] => A2 [2] => A3 ) )
Array ( [0] => Array ( [0] => A1 ) )
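For readers not using PHP, the same two-step idea can be sketched in Python. This is an illustrative port under stated assumptions, not part of the original answer: the helper name `sum_arguments` is hypothetical, and it uses `fullmatch` on a single formula instead of scanning a multi-line text as the PHP demo does.

```python
import re

# Building blocks, mirroring value / value_list / expression above.
VALUE = r"\w+\d+(?::\w+\d+)?"
VALUE_LIST = rf"{VALUE}(?:,{VALUE})*"
EXPRESSION = re.compile(rf"=SUM\(({VALUE_LIST})\)")

def sum_arguments(formula):
    """Return the SUM arguments as a list, or None if the formula is invalid."""
    # Step 1: validate the whole formula and capture the argument list.
    m = EXPRESSION.fullmatch(formula)
    if m is None:
        return None
    # Step 2: find every value inside the captured list.
    return re.findall(VALUE, m.group(1))

print(sum_arguments("=SUM(A1,A11:A212,A12:A56,A342:A12,A3)"))
print(sum_arguments("=SUM(A11:A212:A2,A12:A56,A4,A342:A12)"))
```

The first call returns all five arguments and the second returns None, because the formula with the double range never passes step 1.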
Answer 1 (score: 0)
I would actually split the string first. Something like:
sub IsFormulaValid
{
my $str = $_[0];
(my $match) = $str =~ /^=SUM\(([^)]+)\)$/;
my @sumArgs = split(/,\s*/, $match);
my $valid = 1;
foreach(@sumArgs){
if($_ !~ /^[a-z]+\d+(?::[a-z]+\d+){0,1}$/i){
$valid = 0;
last;
}
}
return $valid;
}
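Note that you could also check the validity of the match itself, and that @sumArgs > 0, when setting $valid. As written, a string that does not match the outer =SUM(...) pattern leaves @sumArgs empty and is reported as valid.
Tested in Perl with your input: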
my @testInput;
push(@testInput,'=SUM(A1,A11:A212,A12:A56,A342:A12,A3)');
push(@testInput,'=SUM(A11:A12,A12:a12,A34:A3)');
push(@testInput,'=SUM(A1,A2,A3)');
push(@testInput,'=SUM(A1)');
push(@testInput,'=SUM(A11:A212:A2,A12:A56,A4,A342:A12)');
foreach(@testInput){
print "'$_'\n ";
print 'NOT ' if !IsFormulaValid($_);
print "VALID\n\n";
}
Result:
'=SUM(A1,A11:A212,A12:A56,A342:A12,A3)'
VALID
'=SUM(A11:A12,A12:a12,A34:A3)'
VALID
'=SUM(A1,A2,A3)'
VALID
'=SUM(A1)'
VALID
'=SUM(A11:A212:A2,A12:A56,A4,A342:A12)'
NOT VALID
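The split-first approach, including the extra check on the outer match mentioned in the note, might look like this in Python. It is a rough sketch rather than a translation of the Perl: the name `is_formula_valid` and the use of `fullmatch` are assumptions made for illustration.

```python
import re

# One argument: a cell (A1) or a range (A1:B2), case-insensitive.
ARG = re.compile(r"[a-z]+\d+(?::[a-z]+\d+)?", re.IGNORECASE)

def is_formula_valid(formula):
    """Validate =SUM(...) by splitting the argument list on commas."""
    outer = re.fullmatch(r"=SUM\(([^)]+)\)", formula)
    if outer is None:
        # The outer shape is wrong, so reject before splitting.
        return False
    # re.split always returns at least one element, so no separate
    # "argument count > 0" check is needed here.
    args = re.split(r",\s*", outer.group(1))
    return all(ARG.fullmatch(arg) is not None for arg in args)

for formula in ["=SUM(A1,A2,A3)", "=SUM(A11:A212:A2,A12:A56)", "not a formula"]:
    print(formula, "VALID" if is_formula_valid(formula) else "NOT VALID")
```

The split list doubles as the per-argument grouping the question asks for, so the function could return `args` instead of a boolean if the values themselves are needed.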