I am using antlr4 with the Python target to match a phrase like this:
select 1 from dual where id=.0union select 1
The tokens are:
['select', '1', 'from', 'dual', 'where', 'id', '=', '.0union', 'select', '1']
My problem is that the .0 and union tokens have been merged into a single token, .0union, and antlr reports this error:
line 1:32 mismatched input '=' expecting {<EOF>, '&&', <INVALID>, ';', <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>, <INVALID>}
Any ideas on how to debug this?
Is there a way to debug the token matching process?
Answer 0 (score: 1)
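As we found out in a private discussion, this problem is related to how the dot identifier rule is defined in the grammar. There is a conflict between inputs such as .0union and .union. The first should be treated as a decimal number followed by a keyword, while the second should be taken as a whole and tokenized as a dot identifier. The solution is therefore to disallow a digit directly after the dot in a dot identifier (such input must always be parsed as a decimal number):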
FLOAT_NUMBER: DECIMAL_NUMBER [eE] (MINUS_OPERATOR | PLUS_OPERATOR)? DIGITS;
DECIMAL_NUMBER: DIGITS? DOT_SYMBOL DIGITS;
// Special rule that should also match all keywords if they are directly preceded by a dot.
// Hence it's defined before all keywords.
DOT_IDENTIFIER: DOT_SYMBOL LETTER_WHEN_UNQUOTED_NO_DIGIT LETTER_WHEN_UNQUOTED*;
fragment LETTER_WHEN_UNQUOTED:
    DIGIT
    | LETTER_WHEN_UNQUOTED_NO_DIGIT
;
fragment LETTER_WHEN_UNQUOTED_NO_DIGIT:
    [a-zA-Z_$\u0080-\uffff]
;
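
To see why the original rule swallowed .0union, it may help to simulate the lexer's "longest match wins, ties go to the rule defined first" behaviour outside of ANTLR. The following is only a rough Python sketch using regular expressions, not the real grammar: the rule set is heavily simplified, the KEYWORD, INT_NUMBER, IDENTIFIER and OPERATOR rules are stand-ins, and OLD_DOT_IDENTIFIER is an assumption about what the rule looked like before the fix (a dot followed by any unquoted letters, digits included).

```python
import re

# Character classes mirroring the two fragments in the grammar above.
NO_DIGIT = r"[a-zA-Z_$\u0080-\uffff]"
ANY = r"[0-9a-zA-Z_$\u0080-\uffff]"

# Assumed shape of the rule before the fix: a digit may follow the dot.
OLD_DOT_IDENTIFIER = r"\." + ANY + "+"
# Shape of the fixed rule: the first character after the dot is not a digit.
NEW_DOT_IDENTIFIER = r"\." + NO_DIGIT + ANY + "*"


def build_rules(dot_identifier):
    """Return a simplified, ordered rule list (stand-ins, not the real grammar)."""
    specs = [
        ("DECIMAL_NUMBER", r"[0-9]*\.[0-9]+"),
        ("INT_NUMBER", r"[0-9]+"),
        ("DOT_IDENTIFIER", dot_identifier),
        ("KEYWORD", r"select|from|where|union"),
        ("IDENTIFIER", NO_DIGIT + ANY + "*"),
        ("OPERATOR", r"="),
        ("WS", r"\s+"),
    ]
    return [(name, re.compile(pattern)) for name, pattern in specs]


def tokenize(text, rules):
    """Pick the longest match at each position; on a tie the earlier rule wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            match = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches are equally long.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # whitespace is skipped, as a lexer would
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    sample = "id=.0union select 1"
    print(tokenize(sample, build_rules(OLD_DOT_IDENTIFIER)))
    print(tokenize(sample, build_rules(NEW_DOT_IDENTIFIER)))
```

With the assumed old rule, DOT_IDENTIFIER matches seven characters of .0union while DECIMAL_NUMBER matches only two, so the longer match wins and the keyword is lost. With the fixed rule, DOT_IDENTIFIER cannot start at .0, so the input splits into a decimal number and a keyword, while .union is still taken as a single dot identifier.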