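Antlr4: mismatched input

Time: 2013-05-03 16:54:53

Tags: antlr4

This is a simple test grammar that I thought would be easy to parse, but I get 'mismatched input' and I can't figure out what Antlr is looking for.

Input: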

  # include "something" program TEST1 { BLAH BLAH }

My grammar:

  grammar ProgHeader;

  program: header* prog EOF ;
  header: '#' ( include | define ) ;
  include: 'include' string ;
  define: 'define' string string? ;
  string: '"' QTEXT '"' ;
  prog: 'program' QTEXT '{' BLOCK '}' ;
  QTEXT: ~[\r\n\"]+ ;
  BLOCK: ~[}]+ ; // don't care, example block
  WS: [ \t\r\n] -> skip ;

The error message output:

line 1:0 mismatched input '# include "something" program TEST1 { BLAH BLAH '
expecting {'program', '#'}

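This really confuses me, because it says it is looking for a '#', and there is one right at the start of the input. I also dumped the parse tree. It appears to get stuck at the very top, in the 'program' rule: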

(program # include "something" program TEST1 { BLAH BLAH  } )

HALP?

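In case it matters, here is the full program that drives this test case (I don't think it should matter, and the information above should be enough, but here it is anyway):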

  package antlrtests;

  import antlrtests.grammars.*;
  import org.antlr.v4.runtime.*;
  import org.antlr.v4.runtime.tree.*;

  /**
   *
   * @author Brenden Towey
   */
  public class ProgHeaderTest {
     private String[] testVectors = {
        "# include \"something\" program TEST1 { BLAH BLAH } ",
     };
     public void runTests() {
        for( String test : testVectors )
           simpleTest( test );
     }
     private void simpleTest( String test ) {
        ANTLRInputStream ains = new ANTLRInputStream( test );
        ProgHeaderLexer wpl = new ProgHeaderLexer( ains );
        CommonTokenStream tokens = new CommonTokenStream( wpl );
        ProgHeaderParser wikiParser = new ProgHeaderParser( tokens );
        ParseTree parseTree = wikiParser.program();
        System.out.println( "'" + test + "': " + parseTree.toStringTree(
                wikiParser ) );
     }
  }

Full output:

run:
line 1:0 mismatched input '# include "something" program TEST1 { BLAH BLAH ' expecting {'program', '#'}
'# include "something" program TEST1 { BLAH BLAH } ': (program # include "something" program TEST1 { BLAH BLAH  } )
BUILD SUCCESSFUL (total time: 0 seconds)

1 answer:

Answer 0: (score: 2)

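The lexer always takes the longest token that matches at the current position, regardless of what the parser expects. At the very start of the input that token is BLOCK, which matches '# include "something" program TEST1 { BLAH BLAH ' (everything up to but not including the '}' character, which is exactly the text quoted in the error message). QTEXT also matches there, but only the shorter text '# include ' (up to but not including the first '"' character). The valid tokens at that point, however, are 'program' and '#', as reported. For this reason it is best to avoid token definitions that match nearly anything.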
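
To make the longest-match behaviour concrete, here is a minimal sketch in plain Java that imitates how the lexer picks the first token. It is a toy model built on java.util.regex, not the ANTLR runtime: the class name LongestMatchDemo and the method firstToken are hypothetical, and only a subset of the grammar's tokens is modelled. The text quoted in the error message runs up to the '}', which suggests that the winning rule is BLOCK; QTEXT on its own would stop at the first '"'.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy model of longest-match tokenization; this is NOT the ANTLR runtime. */
final class LongestMatchDemo {

    // Rules in definition order. Literals used in parser rules become
    // implicit tokens that rank ahead of the named lexer rules.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("'#'", Pattern.compile("#"));
        RULES.put("'program'", Pattern.compile("program"));
        RULES.put("QTEXT", Pattern.compile("[^\\r\\n\"]+")); // ~[\r\n\"]+
        RULES.put("BLOCK", Pattern.compile("[^}]+"));        // ~[}]+
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]"));
    }

    /** Returns "RULE:text" for the rule with the longest match at pos, or null. */
    static String firstToken(String input, int pos) {
        String bestRule = null;
        int bestEnd = pos;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            m.region(pos, input.length());
            // Strict '>' keeps the earlier rule when two matches tie in length.
            if (m.lookingAt() && m.end() > bestEnd) {
                bestRule = rule.getKey();
                bestEnd = m.end();
            }
        }
        return bestRule == null ? null : bestRule + ":" + input.substring(pos, bestEnd);
    }

    public static void main(String[] args) {
        String input = "# include \"something\" program TEST1 { BLAH BLAH } ";
        // Prints the rule that wins at position 0 together with its text.
        System.out.println(firstToken(input, 0));
    }
}
```

In this model the '#' literal matches only one character, so it loses to both catch-all rules, and even the text 'program X' comes out as a single QTEXT token rather than the 'program' keyword. Narrowing the lexer rules (for example, making the whole quoted string one token and using an identifier-style rule for names) is one possible way to avoid the problem, though what fits best depends on the real language being parsed.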