Question

I have the following grammar:

SPACE : (' '|'\t'|'\n'|'\r')+ {$channel = HIDDEN;};
NAME_TAG : 'name';
IS_TAG : 'is';

START : 'START';
END : ('END START') => 'END START'  ;

WORD    : 'A'..'Z'+;

rule :  START NAME_TAG IS_TAG WORD END;

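and I would like to parse the following input: "START name is END END START". The problem is the END token: the first 'END' (a WORD followed by a SPACE) is mismatched. I think the right approach here is a syntactic predicate on the END token, but maybe I'm wrong.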

Answer 1

I wouldn't create tokens that consist of two (or more) words separated by a space. Why not tokenize 'END' as an END token and then do something like this:

rule     : START NAME_TAG IS_TAG word END START;
word     : WORD | END; // expand this rule, as you see fit
NAME_TAG : 'name';
IS_TAG   : 'is';
START    : 'START';
END      : 'END';
WORD     : 'A'..'Z'+;
SPACE    : (' '|'\t'|'\n'|'\r')+ {$channel = HIDDEN;};

This parses "START name is END END START" into the following parse tree:

[image: parse tree]
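
To make the token flow of this approach concrete, here is a rough hand-written Python approximation. It is not ANTLR output and does not reproduce ANTLR's lexing algorithm; the names `KEYWORDS`, `tokenize` and `parse_rule` are made up for illustration, and splitting on whitespace is a simplifying assumption.

```python
import re

# Hypothetical keyword table mirroring the literal lexer rules above.
KEYWORDS = {"name": "NAME_TAG", "is": "IS_TAG", "START": "START", "END": "END"}

def tokenize(text):
    """Split on whitespace (standing in for the hidden SPACE channel) and classify."""
    tokens = []
    for lexeme in text.split():
        if lexeme in KEYWORDS:
            tokens.append((KEYWORDS[lexeme], lexeme))
        elif re.fullmatch(r"[A-Z]+", lexeme):
            tokens.append(("WORD", lexeme))
        else:
            raise ValueError(f"cannot tokenize {lexeme!r}")
    return tokens

def parse_rule(tokens):
    """rule : START NAME_TAG IS_TAG word END START ;  word : WORD | END ;"""
    types = [kind for kind, _ in tokens]
    ok = (
        len(types) == 6
        and types[:3] == ["START", "NAME_TAG", "IS_TAG"]
        and types[3] in ("WORD", "END")   # word : WORD | END
        and types[4:] == ["END", "START"]
    )
    if not ok:
        raise SyntaxError(f"unexpected token sequence: {types}")
    return tokens[3][1]  # the text matched by the `word` rule
```

Under these assumptions, `parse_rule(tokenize("START name is END END START"))` accepts the input and returns the text matched by `word`. The point is that the parser, not the lexer, decides which END is the name and which one closes the block.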

EDIT

What you did wrong is not giving the lexer rule a way to recover when the predicate fails. Here's the correct use of the predicate:

rule     :  START NAME_TAG IS_TAG WORD END;

SPACE    : (' '|'\t'|'\n'|'\r')+ {$channel = HIDDEN;};
NAME_TAG : 'name';
IS_TAG   : 'is';
START    : 'START';
WORD     : ('END START')=> 'END START' {$type=END;}
         | 'A'..'Z'+
         ;

fragment END : ;
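
The "try the predicate, otherwise fall back" idea can be sketched outside ANTLR as well. The following is a hand-written Python approximation, not what ANTLR generates; `tokenize` and the regex constants are hypothetical names, and keyword matching is simplified to whole runs of letters.

```python
import re

SPACE = re.compile(r"[ \t\n\r]+")
END_START = re.compile(r"END START")   # the text the predicate looks ahead for
LETTERS = re.compile(r"[A-Za-z]+")
KEYWORDS = {"name": "NAME_TAG", "is": "IS_TAG", "START": "START"}

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = SPACE.match(text, pos)
        if m:                       # whitespace is hidden: skip it
            pos = m.end()
            continue
        m = END_START.match(text, pos)
        if m:                       # lookahead succeeded: emit one END token
            tokens.append(("END", m.group()))
            pos = m.end()
            continue
        m = LETTERS.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at offset {pos}")
        lexeme = m.group()
        if lexeme in KEYWORDS:
            kind = KEYWORDS[lexeme]
        elif lexeme.isupper():
            kind = "WORD"           # fallback alternative: 'A'..'Z'+
        else:
            raise ValueError(f"cannot tokenize {lexeme!r}")
        tokens.append((kind, lexeme))
        pos = m.end()
    return tokens
```

In this sketch the first 'END' fails the lookahead (it is followed by " END", not " START") and falls through to the WORD branch, while the trailing "END START" becomes a single END token. That matches the shape `rule` expects: START NAME_TAG IS_TAG WORD END.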
