ANTLR4: ambiguous grammar, left recursion, or both?

Date: 2015-11-17 07:33:44

Tags: antlr4

My grammar, shown below, fails to compile. The error returned (by the antlr4 Maven plugin) is:

[INFO] --- antlr4-maven-plugin:4.3:antlr4 (default-cli) @ beebell ---
[INFO] ANTLR 4: Processing source directory /Users/kodecharlie/workspace/beebell/src/main/antlr4
[INFO] Processing grammar: DateRange.g4
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree node: RULE expecting <UP>
[ERROR] error(20):  internal error: Rule HOUR undefined 
[ERROR] error(20):  internal error: Rule MINUTE undefined 
[ERROR] error(20):  internal error: Rule SECOND undefined 
[ERROR] error(20):  internal error: Rule HOUR undefined 
[ERROR] error(20):  internal error: Rule MINUTE undefined 

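I can see how the grammar might be ambiguous: for example, is a two-digit number a minute, a second, or an hour (or perhaps the start of a year)? But some posts suggest that this error is caused by left recursion.

Can you tell what is going on?

Thanks. Here is the grammar: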

grammar DateRange;

range     : startDate (THRU endDate)? | 'Every' LONG_DAY 'from' startDate THRU endDate ;

startDate : dateTime ;
endDate   : dateTime ;
dateTime  : GMTOFF | SHRT_MDY | YYYYMMDD | (WEEK_DAY)? LONG_MDY ;

// Dates.
GMTOFF    : YYYYMMDD 'T' HOUR ':' MINUTE ':' SECOND ('-'|'+') HOUR ':' MINUTE ;
YYYYMMDD  : YEAR '-' MOY '-' DOM ;
SHRT_MDY  : MOY ('/' | '-') DOM ('/' | '-') YEAR ;
LONG_MDY  : (SHRT_MNTH '.'? | LONG_MNTH) WS DOM ','? (WS YEAR (','? WS TIMESPAN)? | WS startTime)? ;

YEAR      : DIGIT DIGIT DIGIT DIGIT ;   // year
MOY       : (DIGIT | DIGIT DIGIT) ;     // month of year.
DOM       : (DIGIT | DIGIT DIGIT) ;     // day of month.
TIMESPAN  : startTime (WS THRU WS endTime)? ;

// Time-of-day.
startTime : TOD ;
endTime   : TOD ;
TOD       : NOON | HOUR2 (':' MINUTE)? WS? MERIDIAN ;
NOON      : 'noon' ;
HOUR2     : (DIGIT | DIGIT DIGIT) ;
MERIDIAN  : 'AM' | 'am' | 'PM' | 'pm' ;

// 24-hour clock.  Sanity-check range in listener.
HOUR      : DIGIT DIGIT ;
MINUTE    : DIGIT DIGIT ;
SECOND    : DIGIT DIGIT ;

// Range verb.
THRU      : WS ('-'|'to') WS -> skip ;

// Weekdays.
WEEK_DAY  : (SHRT_DAY | LONG_DAY) ','? WS ;
SHRT_DAY  : 'Sun' | 'Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' -> skip ;
LONG_DAY  : 'Sunday' | 'Monday' | 'Tuesday' | 'Wednesday' | 'Thursday' | 'Friday' | 'Saturday' -> skip ;

// Months.
SHRT_MNTH : 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Aug' | 'Sep' | 'Oct' | 'Nov' | 'Dec' ;
LONG_MNTH : 'January' | 'February' | 'March' | 'April' | 'May' | 'June' | 'July' | 'August' | 'September' | 'October' | 'November' | 'December' ;

DIGIT     : [0-9] ;
WS        : [ \t\r\n]+ -> skip ;

1 answer:

Answer 0 (score: 0)

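I resolved this by defining a separate rule for each digit-sequence length (1, 2, 3, or 4 digits). I also simplified several rules, essentially trying to make the alternatives within each production simpler. In any case, here is the final result, which compiles: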

grammar DateRange;

range     : 'Every' WS longDay WS 'from' WS startDate THRU endDate
          | startDate THRU endDate
          | startDate
          ;

startDate : dateTime ;
endDate   : dateTime ;
dateTime  : utc
          | shrtMdy
          | yyyymmdd
          | longMdy
          | weekDay ','? WS longMdy
          ;

// Dates.
utc       : yyyymmdd 'T' hour ':' minute ':' second ('-'|'+') hour ':' minute ;
yyyymmdd  : year '-' moy '-' dom ;
shrtMdy   : moy ('/' | '-') dom ('/' | '-') year ;
longMdy   : longMonth WS dom ','? optYearAndOrTime?
          | shrtMonth '.'? WS dom ','? optYearAndOrTime?
          ;

optYearAndOrTime : WS year ','? WS timespan
                 | WS year
                 | WS timespan
                 ;

fragment DIGIT : [0-9] ;
ONE_DIGIT    : DIGIT ;
TWO_DIGITS   : DIGIT ONE_DIGIT ;
THREE_DIGITS : DIGIT TWO_DIGITS ;
FOUR_DIGITS  : DIGIT THREE_DIGITS ;

year      : FOUR_DIGITS ;                   // year
moy       : ONE_DIGIT | TWO_DIGITS ;        // month of year.
dom       : ONE_DIGIT | TWO_DIGITS ;        // day of month.
timespan  : (tod THRU tod) | tod ;

// Time-of-day.
tod       : noon | (hour2 (':' minute)? WS? meridian?) ;
noon      : 'noon' ;
hour2     : ONE_DIGIT | TWO_DIGITS ;
meridian  : ('AM' | 'am' | 'PM' | 'pm' | 'a.m.' | 'p.m.') ;

// 24-hour clock.  Sanity-check range in listener.
hour      : TWO_DIGITS ;
minute    : TWO_DIGITS ;
second    : TWO_DIGITS ;   // we do not use seconds.

// Range verb.
THRU      : WS? ('-'|'–'|'to') WS? ;

// Weekdays.
weekDay   : shrtDay | longDay ;
shrtDay   : 'Sun' | 'Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' ;
longDay   : 'Sunday' | 'Monday' | 'Tuesday' | 'Wednesday' | 'Thursday' | 'Friday' | 'Saturday' ;

// Months.
shrtMonth : 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' | 'Jul' | 'Aug' | 'Sep' | 'Oct' | 'Nov' | 'Dec' ;
longMonth : 'January' | 'February' | 'March' | 'April' | 'May' | 'June' | 'July' | 'August' | 'September' | 'October' | 'November' | 'December' ;

WS        : ~[a-zA-Z0-9,.:]+ ;
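
As a rough illustration of why one token per digit-sequence length helps: an ANTLR 4 lexer picks the rule that matches the longest input and, on a tie, the rule defined first. With HOUR, MINUTE and SECOND all defined as DIGIT DIGIT, the later rules can never win. The Python sketch below simulates only that tie-break. It is not ANTLR itself; the rule tables, the DASH and COLON tokens, and the tokenize() helper are hypothetical names introduced for this example.

```python
import re

# Toy rule tables. Names echo the grammars above, but the tables and the
# tokenize() helper are hypothetical, written only for this illustration.
OVERLAPPING = [
    ("HOUR", r"[0-9][0-9]"),
    ("MINUTE", r"[0-9][0-9]"),
    ("SECOND", r"[0-9][0-9]"),
    ("COLON", r":"),
]
BY_LENGTH = [
    ("ONE_DIGIT", r"[0-9]"),
    ("TWO_DIGITS", r"[0-9]{2}"),
    ("THREE_DIGITS", r"[0-9]{3}"),
    ("FOUR_DIGITS", r"[0-9]{4}"),
    ("COLON", r":"),
    ("DASH", r"-"),
]


def tokenize(text, rules):
    """Longest match wins; on a tie, the rule listed first wins."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            match = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if match and match.end() - pos > best_len:
                best_name, best_len = name, match.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("07:33:44", OVERLAPPING))
print(tokenize("2015-11-17", BY_LENGTH))
```

With the first table, every two-digit field is labelled HOUR, so MINUTE and SECOND are unreachable. With the second table, the tokens carry only their length, and it is left to parser rules such as hour, minute and year to assign meaning by position. A related observation, offered as a reading of the log rather than as part of the original answer: position 13:87 in the error output falls on the reference to startTime inside the lexer rule LONG_MDY, and ANTLR 4 lexer rules may not reference parser rules. The revised grammar sidesteps this by writing those rules as parser rules.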