我决定将C#官方语法翻译为 antlr v4 。但是,在测试时我遇到了以下问题。给定的语法与\n\ntrue\n\n<EOF>
之类的简单单词不匹配。它一直在说mismatched input '\n\ntrue\n\n' expecting Literal
。即使我将Literal
的定义保留为Literal: BooleanLiteral;
,输入\n\ntrue\n\n<EOF>
仍然无法匹配。我期待语法跳过\n
s true
和<EOF>
,但显然这种情况并没有发生。试图调试,但仍然无法找到任何错误。有什么想法吗?
grammar Test;
start: Literal EOF;
/**********
*
* Literals
*
**********/
Literal
: BooleanLiteral
| IntegerLiteral
| RealLiteral
| CharacterLiteral
| StringLiteral
| NullLiteral
;
BooleanLiteral
: 'true'
| 'false'
;
IntegerLiteral
: DecimalIntegerLiteral
| HexadecimalIntegerLiteral
;
DecimalIntegerLiteral
: DecimalDigits IntegerTypeSuffix?
;
DecimalDigits
: DecimalDigit+
;
DecimalDigit
: [0-9]
;
IntegerTypeSuffix
: 'U'
| 'u'
| 'L'
| 'l'
| 'UL'
| 'Ul'
| 'uL'
| 'ul'
| 'LU'
| 'Lu'
| 'lU'
| 'lu'
;
HexadecimalIntegerLiteral
: ('0x' | '0X') HexDigits IntegerTypeSuffix?
;
HexDigits
: HexDigit+
;
HexDigit
: [0-9A-Fa-f]
;
RealLiteral
: DecimalDigits '.' DecimalDigits ExponentPart? RealTypeSuffix?
| '.' DecimalDigits ExponentPart? RealTypeSuffix?
| DecimalDigits ExponentPart RealTypeSuffix?
| DecimalDigits RealTypeSuffix
;
ExponentPart
: ('e' | 'E') Sign? DecimalDigits
;
Sign
: '+'
| '-'
;
RealTypeSuffix
: 'F'
| 'f'
| 'D'
| 'd'
| 'M'
| 'm'
;
CharacterLiteral
: '\'' Character '\''
;
Character
: SingleCharacter
| SimpleEscapeSequence
| HexadecimalEscapeSequence
| UnicodeEscapeSequence
;
UnicodeEscapeSequence
: '\\' 'u' HexDigit HexDigit HexDigit HexDigit
| '\\' 'U' HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit HexDigit
;
SingleCharacter
: ~[\\\\\\\u000D\u000A\u0085\u2028\u2029]
;
SimpleEscapeSequence
: '\\\''
| '\\"'
| '\\\\'
| '\\0'
| '\\a'
| '\\b'
| '\\f'
| '\\n'
| '\\r'
| '\\t'
| '\\v'
;
HexadecimalEscapeSequence
: '\\x' HexDigit HexDigit? HexDigit? HexDigit?
;
StringLiteral
: RegularStringLiteral
| VerbatimStringLiteral
;
RegularStringLiteral
: '"' RegularStringLiteralCharacters? '"'
;
RegularStringLiteralCharacters
: RegularStringLiteralCharacter+
;
RegularStringLiteralCharacter
: SingleRegularStringLiteralCharacter
| SimpleEscapeSequence
| HexadecimalEscapeSequence
| UnicodeEscapeSequence
;
SingleRegularStringLiteralCharacter
: ~["\\\u000D\u000A\u0085\u2028\u2029]
;
VerbatimStringLiteral
: '@"' VerbatimStringLiteralCharacters? '"'
;
VerbatimStringLiteralCharacters
: VerbatimStringLiteralCharacter+
;
VerbatimStringLiteralCharacter
: SingleVerbatimStringLiteralCharacter
| QuoteEscapeSequence
;
SingleVerbatimStringLiteralCharacter
: ~["]
;
QuoteEscapeSequence
: '""'
;
NullLiteral
: 'null'
;
/**********
*
* Whitespaces and comments
*
**********/
WS : [ \t\r\n]+ -> skip
;
COMMENT
: '/*' .*? '*/' -> skip
;
LINE_COMMENT
: '//' ~[\r\n]* -> skip
;
编辑: 好吧,我已经成功地将问题与这段代码隔离开来了:
grammar Test;
start : VerbatimStringLiteral EOF ;
VerbatimStringLiteral
: '@"' VerbatimStringLiteralCharacter* '"'
;
VerbatimStringLiteralCharacter
: SingleVerbatimStringLiteralCharacter
| QuoteEscapeSequence
;
SingleVerbatimStringLiteralCharacter
: ~["]
;
QuoteEscapeSequence
: '""'
;
WS : [ \t\r\n]+ -> skip
;
答案 0 :(得分:1)
不会自行生成令牌的Lexer规则应使用fragment
修饰符进行标记。例如,QuoteEscapeSequence
不是独立令牌;它只是VerbatimStringLiteral
令牌的一部分,因此您应该使用fragment
标记它。以下是一些应该是fragment
规则的其他规则:
VerbatimStringLiteralCharacter
SingleVerbatimStringLiteralCharacter
SingleRegularStringLiteralCharacter
RegularStringLiteralCharacter
RegularStringLiteralCharacters
←这个是您输入此特定输入的错误的来源SimpleEscapeSequence
可能会有更多,但这应该让你知道问题是什么以及如何解决它。