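I am using ANTLR4 to generate an abstract syntax tree (AST) for PL/SQL code. It works fine for some queries, but for some code it produces an AST with only a single node, which is wrong.
For example: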
DECLARE
a RAW; -- migrate to BLOB
b LONG RAW; -- migrate to BLOB
c LONG VARCHAR; -- migrate to LOB
d LONG; -- migrate to LOB
x VARCHAR;
CURSOR mycur RETURN LONG; -- should flag
FUNCTION myfunc RETURN LONG RAW -- should flag
IS
z LONG RAW; -- should flag
BEGIN
RETURN z;
END;
BEGIN
SELECT mycol, CAST(col2 AS RAW) -- should flag
INTO a
FROM mytab
WHERE (b IS OF TYPE(LONG RAW, RAW, VARCHAR)); -- should flag
END;
CREATE TABLE tab (
a RAW, -- should flag
b LONG RAW, -- should flag
c LONG VARCHAR, -- should flag
d LONG, -- should flag
x VARCHAR
);
For this code, it generates this AST:
(compilation_unit DECLARE a RAW ; b LONG RAW ; c LONG VARCHAR ; d LONG ; x VARCHAR ; CURSOR mycur RETURN LONG ; FUNCTION myfunc RETURN LONG RAW IS z LONG RAW ; BEGIN RETURN z ; END ; BEGIN SELECT mycol , CAST ( col2 AS RAW ) INTO a FROM mytab WHERE ( b IS OF TYPE ( LONG RAW , RAW , VARCHAR ) ) ; END ; CREATE TABLE tab ( a RAW , b LONG RAW , c LONG VARCHAR , d LONG , x VARCHAR ) ;)
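That is just the input code placed inside a single compilation_unit node.
But if I give it the same code without the declaration section, it produces a good AST.
For this code: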
FUNCTION myfunc RETURN LONG RAW -- should flag
IS
z LONG RAW; -- should flag
BEGIN
RETURN z;
END;
BEGIN
SELECT mycol, CAST(col2 AS RAW) -- should flag
INTO a
FROM mytab
WHERE (b IS OF TYPE(LONG RAW, RAW, VARCHAR)); -- should flag
END;
CREATE TABLE tab (
a RAW, -- should flag
b LONG RAW, -- should flag
c LONG VARCHAR, -- should flag
d LONG, -- should flag
x VARCHAR
);
it gives this AST:
(compilation_unit (unit_statement (create_function_body FUNCTION (function_name (id (id_expression (regular_id myfunc)))) RETURN (type_spec (datatype (native_datatype_element LONG RAW))) IS (declare_spec (variable_declaration (variable_name (id_expression (regular_id z))) (type_spec (datatype (native_datatype_element LONG RAW))) ;)) (body BEGIN (seq_of_statements (statement (return_statement RETURN (condition (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id z))))))))))))))))))))) ;) END) ;)) BEGIN (unit_statement (data_manipulation_language_statements (select_statement (subquery (subquery_basic_elements (query_block SELECT (selected_element (select_list_elements (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id mycol)))))))))))))))))))) , (selected_element (select_list_elements (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (standard_function CAST ( (concatenation_wrapper (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id col2)))))))))))) AS (type_spec (datatype (native_datatype_element RAW))) ))))))))))))))))) (into_clause INTO (variable_name (id_expression (regular_id a)))) 
(from_clause FROM (table_ref_list (table_ref (table_ref_aux (dml_table_expression_clause (tableview_name (id (id_expression (regular_id mytab))))))))) (where_clause WHERE (condition_wrapper (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom ( (expression_or_vector (expression (logical_and_expression (negated_expression (equality_expression (multiset_expression (relational_expression (compound_expression (concatenation (additive_expression (multiply_expression (datetime_expression (model_expression (unary_expression (atom (general_element (general_element_part (id_expression (regular_id b)))))))))))))) IS OF TYPE ( (type_spec (datatype (native_datatype_element LONG RAW))) , (type_spec (datatype (native_datatype_element RAW))) , (type_spec (datatype (native_datatype_element VARCHAR))) )))))) ))))) ; END ;)))))))))))))))))) unit_statement (unit_statement CREATE TABLE tab) (unit_statement (data_manipulation_language_statements (select_statement (subquery (subquery_basic_elements ( (subquery (subquery_basic_elements a RAW , b LONG RAW , c LONG VARCHAR , d LONG , x VARCHAR)) )) ;)))) <EOF>)
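So it seems that some keywords (rules), such as DECLARE, are missing from the PL/SQL grammar file. I am using the plsql.g4 file from the ANTLR site.
Is there a way to find an updated plsql.g4 file, or do I need to add these rules myself?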
Answer 0 (score: 1)
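The problem actually seems to be that the plsql antlr-grammar does not support the LONG VARCHAR data type, in this part: "c LONG VARCHAR;"
This causes the declaration to be misinterpreted as a procedure call, which throws off the rest of the parse.
I opened an issue on the grammar's GitHub: https://github.com/antlr/grammars-v4/issues/2158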
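Until the grammar is fixed, one quick way to spot this failure mode without reading the whole dump is to count the rule nodes in the LISP-style tree text. The following is only a rough sketch, not part of ANTLR: the helper names rule_nodes and looks_flat are made up for illustration, the two sample strings are shortened stand-ins for the trees in the question, and it assumes that a rule node is printed as "(" immediately followed by the rule name while a literal "(" token is followed by a space, as in the dumps above. A string literal token that happens to start with "(" would fool it.

```python
import re

# Heuristic: a rule node opens as "(" immediately followed by the rule name.
# A literal "(" token from the source text is followed by a space instead.
RULE_OPEN = re.compile(r"\(([A-Za-z_]\w*)")


def rule_nodes(tree_text):
    """Return the rule names that open a node in the tree text, in order."""
    return RULE_OPEN.findall(tree_text)


def looks_flat(tree_text):
    """True when the root is the only rule node, so nothing below it was matched."""
    return len(rule_nodes(tree_text)) <= 1


# Shortened, hypothetical samples in the same shape as the dumps in the question.
flat = (
    "(compilation_unit DECLARE a RAW ; x VARCHAR ; BEGIN SELECT mycol , "
    "CAST ( col2 AS RAW ) INTO a FROM mytab ; END ;)"
)
nested = (
    "(compilation_unit (unit_statement (create_function_body FUNCTION "
    "(function_name (id (id_expression (regular_id myfunc)))))))"
)

print(looks_flat(flat))
print(looks_flat(nested))
```

If looks_flat reports a flat tree, removing declarations one at a time (starting with "c LONG VARCHAR;") and re-parsing should show which construct the grammar cannot handle.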