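I have a grammar that I'm trying to parse with the help of Regexp::Grammars, but for some reason it seems to have a problem with whitespace. I've managed to reduce it to the following: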
use Modern::Perl;
use v5.16;
use Regexp::Grammars;
use Data::Dumper;
my $grammar = qr{
<foo> <baz> | my <foo> is <baz>
<rule: foo> foo | fu | phoo
<rule: baz> bazz?
}ix;
while (<>) {
    chomp;
    if (/$grammar/) {
        say Dumper(\%/);
    }
    else {
        say "NO MATCH!!\n";
    }
}
When I run the program, any input sequence that should match, such as
foo baz
phoo bazz
my fu is baz
makes the program return
NO MATCH!!
However, if I insert a debugging directive before the grammar definition:
<debug: match>
<foo> <baz> | my <foo> is <baz>
...
I get what I expect:
perl.exe : ========> Trying <grammar> from position 0
At line:1 char:5
+ perl <<<< .\test_grammar2.pl 2>&1 > output.txt
+ CategoryInfo : NotSpecified: (========> Tryin...from position 0:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
phoo bazz |...Trying <foo>
| |...Trying subpattern /foo/
| | \FAIL subpattern /foo/
| |...Trying next alternative
| |...Trying subpattern /fu/
| | \FAIL subpattern /fu/
| |...Trying next alternative
| |...Trying subpattern /phoo/
bazz | | \_____subpattern /phoo/ matched 'phoo'
| \_____<foo> matched 'phoo'
|...Trying <baz>
| |...Trying subpattern /bazz?/
[eos] | | \_____subpattern /bazz?/ matched 'bazz'
| \_____<baz> matched ' bazz'
\_____<grammar> matched 'phoo bazz'
$VAR1 = {
'' => 'phoo baz',
baz => ' bazz',
foo => 'phoo'
};
Likewise, if I put an optional whitespace sequence between the subrule calls and the literals:
<foo>\s*<baz> ...
...
I also get a match.
I'm using Windows 7, ActivePerl Build 1603, Perl 5.16.3 and PowerShell. I also tried cmd.exe in case there was some obscure PowerShell issue, but I ran into the same problem. I also tried matching directly:
my $s = q(fu baz);
if ($s =~ $grammar) {
...
}
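but I ran into the same problem, with the same workarounds.
EDIT: What I've learned. When using the Regexp::Grammars module, if your grammar needs whitespace between literals, subrules, or both at the top level, then you need to either encapsulate the pattern in a rule:
<foobaz>
<rule: foobaz> <foo> <baz> | my <foo> is <baz>
escape the whitespace:
<foo>\ <baz> | my\ <foo>\ is\ <baz>
or insert explicit whitespace sequences:
<foo>\s+<baz> | my\s+<foo>\s+is\s+<baz>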
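Answer 0 (score: 2)
OK, I figured out what the problem is. The top-level match in a Regexp::Grammars expression is processed in token mode (whitespace in the input is not skipped), not in rule mode (whitespace is skipped). So, to get what you want, you just need to add a top-level rule: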
my $grammar = qr{
<top>
<rule: top> <foo> <baz> |
my <foo> is <baz>
<rule: foo> foo | fu | phoo
<rule: baz> bazz?
}ix;
Here's my full program:
use Modern::Perl;
use v5.16;
use Regexp::Grammars;
use Data::Dumper;
my $grammar = qr{
<top>
<rule: top> <foo> <baz> |
my <foo> is <baz>
<rule: foo> foo | fu | phoo
<rule: baz> bazz?
}ix;
1;
while (<>) {
    chomp;
    if (/$grammar/) {
        say Dumper(\%/);
    }
    else {
        say "NO MATCH!!\n";
    }
}
Here's my output:
% echo FU baz | perl grammar.pl
$VAR1 = {
'' => 'FU baz',
'top' => {
'' => 'FU baz',
'baz' => 'baz',
'foo' => 'FU'
}
};
% echo my phoo is bazz | perl grammar.pl
$VAR1 = {
'' => 'my phoo is bazz',
'top' => {
'' => 'my phoo is bazz',
'baz' => 'bazz',
'foo' => 'phoo'
}
};
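The Regexp::Grammars documentation clearly states that the top level is handled in token mode. Adding a top-level rule only adds one layer to the parse tree, but I don't think you get to choose whether whitespace is ignored at the top level.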
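For readers who want to see the token-versus-rule difference without the module, here is a minimal plain-Perl sketch. It is only an analogy, not how Regexp::Grammars is implemented: it assumes that a token-like pattern behaves like an ordinary /x regex (whitespace in the pattern is ignored), and that a rule behaves as if `\s*` were written wherever the pattern contains whitespace. The names `$token_like` and `$rule_like` are made up for this illustration.

```perl
use strict;
use warnings;

# Analogy only: no Regexp::Grammars involved.
# Under /x the spaces in the pattern are ignored, so this is really /^foobaz$/.
my $token_like = qr{ ^ foo baz $ }x;

# Assumption: a rule acts as if optional whitespace (\s*) were spelled out
# wherever the pattern contains whitespace.
my $rule_like = qr{ ^ \s* foo \s* baz \s* $ }x;

for my $input ('foo baz', 'foobaz') {
    printf "%-10s token-like: %-3s  rule-like: %s\n",
        "'$input'",
        ($input =~ $token_like ? 'yes' : 'no'),
        ($input =~ $rule_like  ? 'yes' : 'no');
}
```

With the input `foo baz`, the token-like pattern fails and the rule-like pattern matches, which mirrors the NO MATCH result in the question and the successful match once the alternatives are wrapped in a rule.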