ANTLR4 Lexer getTokens() returns 0 tokens

Date: 2015-06-05 16:08:22

Tags: java antlr antlr4

I am running the code at https://github.com/bkiers/antlr4-csv-demo. I want to see the tokens produced by the lexer, so I added the following line:

System.out.println("Number of tokens: " + tokens.getTokens().size());

to Main.java:

public static void main(String[] args) throws Exception {  
    // the input source  
    String source =   
        "aaa,bbb,ccc" + "\n" +   
        "\"d,\"\"d\",eee,fff";  

    // create an instance of the lexer  
    CSVLexer lexer = new CSVLexer(new ANTLRInputStream(source));  

    // wrap a token-stream around the lexer  
    CommonTokenStream tokens = new CommonTokenStream(lexer);  

    // look at tokens analyzed
    System.out.println("Number of tokens: " + tokens.getTokens().size());

    // create the parser  
    CSVParser parser = new CSVParser(tokens);  

    // invoke the entry point of our grammar  
    List<List<String>> data = parser.file().data;  

    // display the contents of the CSV source  
    for(int r = 0; r < data.size(); r++) {  
      List<String> row = data.get(r);  
      for(int c = 0; c < row.size(); c++) {  
        System.out.println("(row=" + (r+1) + ",col=" + (c+1) + ") = " + row.get(c));  
      }  
    }  
  }  

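The output is: Number of tokens: 0. Why is the list returned by getTokens() empty? The rest of the parser code returns the data correctly.

Edit: Using lexer.getAllTokens() works, but why doesn't CommonTokenStream return the correct tokens?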

csv.g4:

grammar CSV;

@header {
  package csv;
}

file returns [List<List<String>> data]  
@init {$data = new ArrayList<List<String>>();}  
 : (row {$data.add($row.list);})+ EOF  
 ; 

row returns [List<String> list]  
@init {$list = new ArrayList<String>();}  
 : a=value {$list.add($a.val);} (Comma b=value {$list.add($b.val);})* (LineBreak | EOF)  
 ;

value returns [String val]  
 : SimpleValue {$val = $SimpleValue.text;}  
 | QuotedValue   
   { 
     $val = $QuotedValue.text; 
     $val = $val.substring(1, $val.length()-1); // remove leading- and trailing quotes 
     $val = $val.replace("\"\"", "\""); // replace all `""` with `"` 
   }  
 ;  

Comma  
 : ','  
 ;  

LineBreak  
 : '\r'? '\n'  
 | '\r'  
 ;  

SimpleValue  
 : ~[,\r\n"]+  
 ;  

QuotedValue  
 : '"' ('""' | ~'"')* '"'  
 ;  

1 answer:

Answer 0 (score: 6)

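Normally, the Parser is responsible for initiating the lexing of the input stream, so the token stream's buffer is still empty when getTokens() is called before any parsing has happened. To start lexing manually, call CommonTokenStream.fill() (implemented in BufferedTokenStream) before calling getTokens().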
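
In the question's Main.java, this would amount to calling tokens.fill() on the line before tokens.getTokens().size(). The lazy-buffering behaviour behind this can be illustrated with a small self-contained sketch. Note that LazyTokenStream below is a hypothetical stand-in written only for this illustration; it is not ANTLR's actual implementation, and it uses plain strings in place of real Token objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LazyTokenBufferDemo {

    /** Hypothetical, simplified model of a lazily buffered token stream (not ANTLR's class). */
    static class LazyTokenStream {
        private final Iterator<String> source;                 // plays the role of the lexer
        private final List<String> buffer = new ArrayList<>(); // tokens pulled so far

        LazyTokenStream(Iterator<String> source) {
            this.source = source;
        }

        /** Returns only what has already been buffered; never pulls from the source. */
        List<String> getTokens() {
            return buffer;
        }

        /** Pulls every remaining token from the source into the buffer. */
        void fill() {
            while (source.hasNext()) {
                buffer.add(source.next());
            }
        }
    }

    public static void main(String[] args) {
        // Made-up token texts standing in for the lexer's output.
        List<String> lexed = Arrays.asList("aaa", ",", "bbb", ",", "ccc");
        LazyTokenStream tokens = new LazyTokenStream(lexed.iterator());

        // Nothing has consumed the stream yet, so the buffer is empty.
        System.out.println("Before fill(): " + tokens.getTokens().size());

        // Force the whole input to be tokenized up front.
        tokens.fill();
        System.out.println("After fill(): " + tokens.getTokens().size());
    }
}
```

In this model the first call reports 0 and the second reports 5, which mirrors the symptom in the question: the buffer only contains tokens once something (the parser, or an explicit fill()) has pulled them from the lexer.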