I am running the code from https://github.com/bkiers/antlr4-csv-demo. I want to see the tokens produced by the lexer, so I added the following line:
System.out.println("Number of tokens: " + tokens.getTokens().size());
to Main.java:
public static void main(String[] args) throws Exception {
    // the input source
    String source =
        "aaa,bbb,ccc" + "\n" +
        "\"d,\"\"d\",eee,fff";
    // create an instance of the lexer
    CSVLexer lexer = new CSVLexer(new ANTLRInputStream(source));
    // wrap a token-stream around the lexer
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    // look at tokens analyzed
    System.out.println("Number of tokens: " + tokens.getTokens().size());
    // create the parser
    CSVParser parser = new CSVParser(tokens);
    // invoke the entry point of our grammar
    List<List<String>> data = parser.file().data;
    // display the contents of the CSV source
    for (int r = 0; r < data.size(); r++) {
        List<String> row = data.get(r);
        for (int c = 0; c < row.size(); c++) {
            System.out.println("(row=" + (r+1) + ",col=" + (c+1) + ") = " + row.get(c));
        }
    }
}
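This prints: Number of tokens: 0
Why does getTokens() return an empty list? The rest of the parser code returns the data correctly.
Edit: using lexer.getAllTokens() does work, but why doesn't CommonTokenStream return the right tokens?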
csv.g4:
grammar CSV;

@header {
  package csv;
}

file returns [List<List<String>> data]
@init {$data = new ArrayList<List<String>>();}
  : (row {$data.add($row.list);})+ EOF
  ;

row returns [List<String> list]
@init {$list = new ArrayList<String>();}
  : a=value {$list.add($a.val);} (Comma b=value {$list.add($b.val);})* (LineBreak | EOF)
  ;

value returns [String val]
  : SimpleValue {$val = $SimpleValue.text;}
  | QuotedValue
    {
      $val = $QuotedValue.text;
      $val = $val.substring(1, $val.length()-1); // remove leading- and trailing quotes
      $val = $val.replace("\"\"", "\""); // replace all `""` with `"`
    }
  ;

Comma
  : ','
  ;

LineBreak
  : '\r'? '\n'
  | '\r'
  ;

SimpleValue
  : ~[,\r\n"]+
  ;

QuotedValue
  : '"' ('""' | ~'"')* '"'
  ;
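Answer 0 (score: 6)
Normally, the parser is responsible for initiating lexing of the input stream, so the token stream's buffer is still empty at the point where getTokens() is called, before any parsing has happened. To start lexing manually, call CommonTokenStream.fill() (implemented in BufferedTokenStream).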
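
In the question's Main.java, this would mean calling tokens.fill(); on the line before tokens.getTokens() is printed. To illustrate why the order matters, here is a minimal, self-contained sketch of a lazily filled token buffer. It is a simplified model and not ANTLR's actual implementation: the class name LazyTokenBuffer and its toy comma-splitting "lexer" are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Simplified model of a lazily filled token buffer (hypothetical, not ANTLR code). */
class LazyTokenBuffer {
    // Stand-in for the lexer: the tokens that have not been pulled yet.
    private final String[] pending;
    private int next = 0;
    // Tokens that have actually been pulled into the buffer so far.
    private final List<String> tokens = new ArrayList<>();

    LazyTokenBuffer(String source) {
        // Toy "lexer": split on commas only, keeping empty trailing values.
        this.pending = source.split(",", -1);
    }

    /** Returns only what has been buffered so far; never pulls from the lexer. */
    List<String> getTokens() {
        return Collections.unmodifiableList(tokens);
    }

    /** Pulls every remaining token from the lexer into the buffer. */
    void fill() {
        while (next < pending.length) {
            tokens.add(pending[next++]);
        }
    }

    public static void main(String[] args) {
        LazyTokenBuffer buffer = new LazyTokenBuffer("aaa,bbb,ccc");
        System.out.println("Before fill(): " + buffer.getTokens().size()); // 0
        buffer.fill();
        System.out.println("After fill(): " + buffer.getTokens().size());  // 3
    }
}
```

In this model, nothing is lexed until something asks for tokens, which mirrors the behavior described above: the parser would normally be the consumer that triggers lexing, and fill() is the explicit way to trigger it without a parser.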