I wrote a small demo on top of ANTLR v4, shown below:
String expression = "var c = a + b()";
ExprLexer lexer = new ExprLexer(CharStreams.fromString(expression));
CommonTokenStream tokens = new CommonTokenStream(lexer);
ExprParser parser = new ExprParser(tokens);
lexer.removeErrorListeners();
parser.removeErrorListeners();
CountingErrorListener errorListener = new CountingErrorListener();
parser.addErrorListener(errorListener);
Vocabulary vocabulary = lexer.getVocabulary();
System.out.println("vocabulary : "+vocabulary.getDisplayName(4));
The last line prints the symbolic name ID to the console. ID is defined in the .g4 file, for example:
ID: [a-zA-Z] [a-zA-Z0-9_]*;
My question: is there a class or method that lets me retrieve the original pattern of ID, [a-zA-Z] [a-zA-Z0-9_]*, from within the program?
Answer 0 (score: 0)
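There is no direct way to do this. What you can do is parse the grammar itself with ANTLR's own parser: https://github.com/antlr/grammars-v4/tree/master/antlr/antlr4
You then create a listener that collects the patterns of all lexer rules in a Map.
A quick demo: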
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.misc.Interval;
import org.antlr.v4.runtime.tree.ParseTreeWalker;
import java.util.LinkedHashMap;
import java.util.Map;

public class Main {
    public static void main(String[] args) throws Exception {
        // The grammar to inspect, given inline here for brevity.
        String source = "grammar T;\n" +
                "\n" +
                "parse\n" +
                " : ID+ EOF\n" +
                " ;\n" +
                "\n" +
                "VAR\n" +
                " : '$' ID\n" +
                " ;\n" +
                "\n" +
                "ID\n" +
                " : [a-zA-Z] [a-zA-Z0-9_]*\n" +
                " ;";

        // ANTLRv4Lexer and ANTLRv4Parser are generated from the ANTLR v4 grammar linked above.
        ANTLRv4Lexer lexer = new ANTLRv4Lexer(CharStreams.fromString(source));
        ANTLRv4Parser parser = new ANTLRv4Parser(new CommonTokenStream(lexer));
        LexerRuleListener lexerRuleListener = new LexerRuleListener(lexer.getInputStream());
        ParseTreeWalker.DEFAULT.walk(lexerRuleListener, parser.grammarSpec());
        System.out.println("ID's pattern: " + lexerRuleListener.getPatternForToken("ID"));
    }
}

class LexerRuleListener extends ANTLRv4ParserBaseListener {
    // Maps each lexer rule name to the source text of its body.
    private final Map<String, String> lexerRuleMap;
    private final CharStream inputStream;

    public LexerRuleListener(CharStream inputStream) {
        this.lexerRuleMap = new LinkedHashMap<>();
        this.inputStream = inputStream;
    }

    public String getPatternForToken(String tokenName) {
        return this.lexerRuleMap.get(tokenName);
    }

    // lexerRuleSpec
    // : DOC_COMMENT* FRAGMENT? TOKEN_REF COLON lexerRuleBlock SEMI
    // ;
    @Override
    public void enterLexerRuleSpec(ANTLRv4Parser.LexerRuleSpecContext ctx) {
        // Take the character range covered by the rule body and copy it from the original input.
        int startIndex = ctx.lexerRuleBlock().start.getStartIndex();
        int stopIndex = ctx.lexerRuleBlock().stop.getStopIndex();
        String text = inputStream.getText(new Interval(startIndex, stopIndex));
        lexerRuleMap.put(ctx.TOKEN_REF().getText(), text);
    }
}
Running the code above prints:
ID's pattern: [a-zA-Z] [a-zA-Z0-9_]*
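Note that the printed pattern keeps the space between the two character sets. The listener gets this by slicing the original character stream between the first and last token of the rule body, rather than by joining token texts. As far as I know, calling getText() on a rule context concatenates only the tokens in the tree, so whitespace the grammar lexer sends off-channel would be dropped. The sketch below illustrates the difference in plain Java, without ANTLR. The SliceDemo class, the Tok type, and both helper methods are hypothetical stand-ins made up for illustration, not part of the ANTLR API.

```java
import java.util.Arrays;
import java.util.List;

public class SliceDemo {

    // Hypothetical stand-in for a parser token: its text plus the inclusive
    // character offsets it occupies in the source.
    public static final class Tok {
        final String text;
        final int start;
        final int stop;

        public Tok(String text, int start) {
            this.text = text;
            this.start = start;
            this.stop = start + text.length() - 1;
        }
    }

    // Roughly what joining the token texts of a subtree yields:
    // whitespace that never became a tree token is lost.
    public static String joinTokens(List<Tok> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : tokens) {
            sb.append(t.text);
        }
        return sb.toString();
    }

    // Same idea as inputStream.getText(new Interval(start, stop)):
    // both bounds are inclusive, hence the "+ 1".
    public static String slice(String source, int start, int stop) {
        return source.substring(start, stop + 1);
    }

    public static void main(String[] args) {
        String source = "ID\n : [a-zA-Z] [a-zA-Z0-9_]*\n ;";

        // Offsets are looked up here; a real lexer records them on each token.
        List<Tok> body = Arrays.asList(
                new Tok("[a-zA-Z]", source.indexOf("[a-zA-Z]")),
                new Tok("[a-zA-Z0-9_]", source.indexOf("[a-zA-Z0-9_]")),
                new Tok("*", source.indexOf("*")));

        int start = body.get(0).start;
        int stop = body.get(body.size() - 1).stop;

        System.out.println("joined: " + joinTokens(body));           // joined: [a-zA-Z][a-zA-Z0-9_]*
        System.out.println("sliced: " + slice(source, start, stop)); // sliced: [a-zA-Z] [a-zA-Z0-9_]*
    }
}
```

If the exact spacing of the pattern does not matter for your use case, joining token texts may be good enough. Slicing the input is the safer choice when the text should match what was written in the .g4 file.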