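Goal
I'm working on a project to create a Varscoper for ColdFusion CFScript. Basically, this means checking source code files to ensure that developers have properly var'd their variables.
After a couple of days of working with ANTLR v4, I have a grammar that generates a very nice parse tree in the GUI view. Now, using that tree, I need a way to programmatically crawl the nodes looking for variable declarations and ensure that, if they are inside functions, they have the proper scoping. If possible, I would rather not do this in the grammar file, as that would require mixing the definition of the language with this specific task.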
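What I've tried
My most recent attempt was to use the ParserRuleContext and try to walk its children via getPayload(). After checking the class of the getPayload() result, I would have either a ParserRuleContext object or a Token object. Unfortunately, using that I was never able to find a way to get the actual rule type of a specific node, only the text it contains. The rule type of each node is necessary because it matters whether that text node is an ignored right-hand expression, a variable assignment, or a function declaration.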
Question
Here is my sample Java code:
Cfscript.java
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.Trees;
public class Cfscript {
public static void main(String[] args) throws Exception {
ANTLRInputStream input = new ANTLRFileStream(args[0]);
CfscriptLexer lexer = new CfscriptLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
CfscriptParser parser = new CfscriptParser(tokens);
parser.setBuildParseTree(true);
ParserRuleContext tree = parser.component();
tree.inspect(parser); // show in gui
/*
Recursively go through the tree, finding function declarations and ensuring all variableDeclarations are var'd,
but how?
*/
}
}
Cfscript.g4
grammar Cfscript;
component
: 'component' keyValue* '{' componentBody '}'
;
componentBody
: (componentElement)*
;
componentElement
: statement
| functionDeclaration
;
functionDeclaration
: Identifier? Identifier? 'function' Identifier argumentsDefinition '{' functionBody '}'
;
argumentsDefinition
: '(' argumentDefinition (',' argumentDefinition)* ')'
| '()'
;
argumentDefinition
: Identifier? Identifier? argumentName ('=' expression)?
;
argumentName
: Identifier
;
functionBody
: (statement)*
;
statement
: variableStatement
| nonVarVariableStatement
| expressionStatement
;
variableStatement
: 'var' variableName '=' expression ';'
;
nonVarVariableStatement
: variableName '=' expression ';'
;
expressionStatement
: expression ';'
;
expression
: assignmentExpression
| arrayLiteral
| objectLiteral
| StringLiteral
| incrementExpression
| decrementExpression
| 'true'
| 'false'
| Identifier
;
incrementExpression
: variableName '++'
;
decrementExpression
: variableName '--'
;
assignmentExpression
: Identifier (assignmentExpressionSuffix)*
| assignmentExpression (('+'|'-'|'/'|'*') assignmentExpression)+
;
assignmentExpressionSuffix
: '.' assignmentExpression
| ArrayIndex
| ('()' | '(' expression (',' expression)* ')' )
;
methodCall
: Identifier ('()' | '(' expression (',' expression)* ')' )
;
variableName
: Identifier (variableSuffix)*
;
variableSuffix
: ArrayIndex
| '.' variableName
;
arrayLiteral
: '[' expression (',' expression)* ']'
;
objectLiteral
: '{' (Identifier '=' expression (',' Identifier '=' expression)*)? '}'
;
keyValue
: Identifier '=' StringLiteral
;
StringLiteral
: '"' (~('\\'|'"'))* '"'
;
ArrayIndex
: '[' [1-9] [0-9]* ']'
| '[' StringLiteral ']'
;
Identifier
: [a-zA-Z0-9]+
;
WS
: [ \t\r\n]+ -> skip
;
COMMENT
: '/*' .*? '*/' -> skip
;
Test.cfc (test code file)
component something = "foo" another = "more" persistent = "true" datasource = "#application.env.dsn#" {
var method = something.foo.test1;
testing = something.foo[10];
testingagain = something.foo["this is a test"];
nuts["testing"]++;
blah.test().test3["test"]();
var math = 1 + 2 - blah.test().test4["test"];
var test = something;
var testing = somethingelse;
var testing = {
test = more,
mystuff = {
interior = test
},
third = "third key"
};
other = "Idunno homie";
methodCall(interiorMethod());
public function bar() {
var new = "somebody i used to know";
something = [1, 2, 3];
}
function nuts(required string test1 = "first", string test = "second", test3 = "third") {
}
private boolean function baz() {
var this = "something else";
}
}
Answer 0 (score: 38)
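If I were you, I wouldn't walk this manually. After generating the lexer and parser, ANTLR will also have generated a file called CfscriptBaseListener that has empty methods for all of your parser rules. You can let ANTLR walk your tree and attach a custom tree listener in which you override only those methods/rules you're interested in.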
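In your case, you probably want to be notified whenever a new function is created (to create a new scope), and you're probably interested in variable assignments (variableStatement and nonVarVariableStatement). Your listener, let's call it VarListener, will keep track of all scopes as ANTLR walks the tree.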
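To make the enter/exit callback order concrete, here is a minimal, self-contained sketch of how a depth-first walker dispatches to a listener. The Node, NodeListener, and ToyWalker types are hypothetical stand-ins invented for this illustration; they are not ANTLR's classes, and ANTLR's generated listener has one enter/exit pair per rule rather than a single generic pair.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse-tree node: a rule name plus children.
class Node {
    final String rule;
    final List<Node> children = new ArrayList<>();

    Node(String rule, Node... kids) {
        this.rule = rule;
        for (Node k : kids) {
            children.add(k);
        }
    }
}

// Hypothetical listener with a single enter/exit pair for every node.
interface NodeListener {
    void enter(Node n);
    void exit(Node n);
}

class ToyWalker {
    // Depth-first walk: enter fires before the children, exit after them.
    static void walk(NodeListener listener, Node n) {
        listener.enter(n);
        for (Node child : n.children) {
            walk(listener, child);
        }
        listener.exit(n);
    }

    // Records the callback order, '+' for enter and '-' for exit.
    static String trace(Node root) {
        StringBuilder sb = new StringBuilder();
        walk(new NodeListener() {
            public void enter(Node n) { sb.append('+').append(n.rule).append(' '); }
            public void exit(Node n)  { sb.append('-').append(n.rule).append(' '); }
        }, root);
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Node tree = new Node("component",
                new Node("variableStatement"),
                new Node("functionDeclaration",
                        new Node("nonVarVariableStatement")));
        System.out.println(trace(tree));
    }
}
```

Because the exit callback for a function fires only after everything inside it has been visited, pushing a scope on enter and popping it on exit keeps the top of the stack equal to the scope of the statement currently being visited.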
I did change one rule slightly (I added objectLiteralEntry):
objectLiteral
: '{' (objectLiteralEntry (',' objectLiteralEntry)*)? '}'
;
objectLiteralEntry
: Identifier '=' expression
;
which makes life easier in the following demo:
import java.util.Stack;

public class VarListener extends CfscriptBaseListener {
private Stack<Scope> scopes;
public VarListener() {
scopes = new Stack<Scope>();
scopes.push(new Scope(null));
}
@Override
public void enterVariableStatement(CfscriptParser.VariableStatementContext ctx) {
String varName = ctx.variableName().getText();
Scope scope = scopes.peek();
scope.add(varName);
}
@Override
public void enterNonVarVariableStatement(CfscriptParser.NonVarVariableStatementContext ctx) {
String varName = ctx.variableName().getText();
checkVarName(varName);
}
@Override
public void enterObjectLiteralEntry(CfscriptParser.ObjectLiteralEntryContext ctx) {
String varName = ctx.Identifier().getText();
checkVarName(varName);
}
@Override
public void enterFunctionDeclaration(CfscriptParser.FunctionDeclarationContext ctx) {
scopes.push(new Scope(scopes.peek()));
}
@Override
public void exitFunctionDeclaration(CfscriptParser.FunctionDeclarationContext ctx) {
scopes.pop();
}
private void checkVarName(String varName) {
Scope scope = scopes.peek();
if(scope.inScope(varName)) {
System.out.println("OK : " + varName);
}
else {
System.out.println("Oops : " + varName);
}
}
}
A Scope object can be as simple as:
import java.util.HashSet;

class Scope extends HashSet<String> {
final Scope parent;
public Scope(Scope parent) {
this.parent = parent;
}
boolean inScope(String varName) {
if(super.contains(varName)) {
return true;
}
return parent == null ? false : parent.inScope(varName);
}
}
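As a quick illustration of the parent-chain lookup, the following sketch builds a component-level scope and a nested function scope by hand and queries both. It repeats a condensed Scope class so the snippet compiles on its own; ScopeDemo and the variable names are only for illustration.

```java
import java.util.HashSet;

// Same shape as the Scope class above, repeated so this compiles on its own.
class Scope extends HashSet<String> {
    final Scope parent;

    Scope(Scope parent) {
        this.parent = parent;
    }

    // A name is in scope if it is declared here or in any enclosing scope.
    boolean inScope(String varName) {
        return contains(varName) || (parent != null && parent.inScope(varName));
    }
}

class ScopeDemo {
    public static void main(String[] args) {
        Scope component = new Scope(null);      // outermost scope
        component.add("method");                // var method = ...;

        Scope function = new Scope(component);  // pushed on entering a function
        function.add("new");                    // var new = ...;

        System.out.println(function.inScope("new"));        // declared locally
        System.out.println(function.inScope("method"));     // found via the parent
        System.out.println(function.inScope("something"));  // never declared
        System.out.println(component.inScope("new"));       // locals do not leak outward
    }
}
```

Note the design choice this exposes: a name declared in an enclosing scope counts as in scope inside a function. Whether that matches the rules you want to enforce is for you to decide.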
Now, to test all of this, here's a small main class:
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
public class Main {
public static void main(String[] args) throws Exception {
CfscriptLexer lexer = new CfscriptLexer(new ANTLRFileStream("Test.cfc"));
CfscriptParser parser = new CfscriptParser(new CommonTokenStream(lexer));
ParseTree tree = parser.component();
ParseTreeWalker.DEFAULT.walk(new VarListener(), tree);
}
}
If you run this Main class, the following will be printed:
Oops : testing
Oops : testingagain
OK : test
Oops : mystuff
Oops : interior
Oops : third
Oops : other
Oops : something
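Without a doubt, this is not exactly what you want, and I have probably gotten some of ColdFusion's scoping rules wrong. But I think this will give you some insight into how to solve your problem properly. I think the code is pretty self-explanatory, but if it isn't, don't hesitate to ask for clarification.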
HTH