antlr4 + python: debugging token matching

Asked: 2016-10-24 03:46:36

Tags: python antlr antlr4

I am using antlr4 with the Python target to match a phrase like this:

select 1 from dual where id=.0union select 1

The tokens are:

['select', '1', 'from', 'dual', 'where', 'id', '=', '.0union', 'select', '1']

My problem is that .0 and union have been merged into a single token, .0union, and antlr reports an error like this:

line 1:32 mismatched input '=' expecting {<EOF>, '&&', <INVALID>, ';', <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>}

Any ideas on how to debug this?

Is there a way to debug the token matching process?

1 Answer:

Answer 0 (score: 1)

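As we found out in a private discussion, this problem comes from how the dot identifier rule is defined in the grammar. There is a conflict between inputs like .0union and .union. The first should be treated as a decimal number followed by a keyword, while the second should be taken as a whole and tokenized as a dot identifier. The lexer always prefers the longest match, so if a digit is allowed right after the dot, the dot identifier rule consumes all of .0union and beats the shorter decimal match .0. The solution is therefore to disallow a digit directly after the dot in a dot identifier (such input must always be lexed as a decimal number):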

FLOAT_NUMBER: DECIMAL_NUMBER [eE] (MINUS_OPERATOR | PLUS_OPERATOR)? DIGITS;
DECIMAL_NUMBER: DIGITS? DOT_SYMBOL DIGITS;

// Special rule that should also match all keywords if they are directly preceded by a dot.
// Hence it's defined before all keywords.
DOT_IDENTIFIER: DOT_SYMBOL LETTER_WHEN_UNQUOTED_NO_DIGIT LETTER_WHEN_UNQUOTED*;

fragment LETTER_WHEN_UNQUOTED:
    DIGIT
    | LETTER_WHEN_UNQUOTED_NO_DIGIT
;

fragment LETTER_WHEN_UNQUOTED_NO_DIGIT:
    [a-zA-Z_$\u0080-\uffff]
;
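
To see why the change helps, the longest-match behaviour can be imitated in plain Python without generating a lexer. This is only a rough sketch under stated assumptions: the "old" DOT_IDENTIFIER pattern is a guess at what the rule looked like before the fix (it is not shown above), WORD, INT_NUMBER, EQUAL and WS are hypothetical stand-ins for the keyword, identifier and operator rules of the real grammar, and the tokenize helper is not part of the ANTLR runtime.

```python
import re

# Character classes approximating the grammar fragments above.
NO_DIGIT = r"[a-zA-Z_$\u0080-\uffff]"   # LETTER_WHEN_UNQUOTED_NO_DIGIT
LETTER = r"[0-9a-zA-Z_$\u0080-\uffff]"  # LETTER_WHEN_UNQUOTED

# Hypothetical stand-ins for the rest of the grammar (assumption).
COMMON_RULES = [
    ("DECIMAL_NUMBER", r"[0-9]*\.[0-9]+"),
    ("INT_NUMBER", r"[0-9]+"),
    ("WORD", NO_DIGIT + LETTER + "*"),
    ("EQUAL", r"="),
    ("WS", r"\s+"),
]

# Assumed shape of the rule before the fix: a digit may follow the dot.
OLD_RULES = [("DOT_IDENTIFIER", r"\." + LETTER + "+")] + COMMON_RULES
# Rule after the fix: the first character after the dot must not be a digit.
NEW_RULES = [("DOT_IDENTIFIER", r"\." + NO_DIGIT + LETTER + "*")] + COMMON_RULES


def tokenize(text, rules):
    """Longest match wins; on a tie the rule listed first wins."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in compiled:
            match = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches are equally long.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # whitespace is skipped
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


SAMPLE = "select 1 from dual where id=.0union select 1"

if __name__ == "__main__":
    for label, rules in (("old", OLD_RULES), ("new", NEW_RULES)):
        print(label, tokenize(SAMPLE, rules))
```

With the assumed old rule, .0union comes out as one DOT_IDENTIFIER token, which matches the token list in the question. With the new rule it splits into DECIMAL_NUMBER .0 followed by the word union, while .union on its own is still a single DOT_IDENTIFIER.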