令人困惑的SQL解析问题:WHERE子句解析在所有条件

时间:2017-12-31 20:12:47

标签: parsing antlr antlr4

我在Antlr 4中这样做,但我相信这与此无关。 我正在解析SQL SELECT语句的WHERE子句。解析器无法将各个条件子句组合在一起。 错误是:

line 4:44 mismatched input 'AND' expecting {'EXCEPT', 'FROM', GROUP, ...

这是输入声明:

SELECT col1, col2, col3
FROM T1
WHERE   (NS1.Func1(col1, 123) == false) AND 
        (String.IsNullOrEmpty(Func2(col2))) AND
        (NOT (NS2.Func3(col3) == "1")) ;

有问题的AND实际上是where子句中的第二个。第一个被正确解析。 我预计第二个AND也属于: expression_body(expression_tail)*规则。但由于某些原因,它没有。

我已经粘贴了令牌流和下面的树输出。 请注意,在树上"其中"在处理第二个AND之前,子句实际上已经结束。

语法的相关部分(如果你需要更多,请告诉我):

where_clause:
    WHERE expression
    ;

expression:
    expression_body (expression_tail)*
    ;

expression_tail:
    comparison_operator expression
    | comparison_operator (ALL | SOME | ANY) LPAREN select_query RPAREN
    | IN LPAREN expression ( COMMA expression )* RPAREN
    | IS (NOT)? NULL
    | NOT? LIKE expression ( ESCAPE expression )?
    | common_clause_token expression
    ;

expression_body:
    DEFAULT 
    | STAR
    | NULL 
    | func_call
    | BETWEEN expression AND expression
    | IF LPAREN expression COMMA expression COMMA expression RPAREN
    | windowing_func
    | EXISTS LPAREN select_query RPAREN
    | LPAREN expression RPAREN
    | identifier_up_to_two_parts
    | CHAR
    | NUMBER
    | STRING
    ;

func_call:
    UNQUOTEDIDENTIFIER ( DOT ( UNQUOTEDIDENTIFIER | func_call) )* LPAREN ( expression (COMMA expression)*)? RPAREN
    | CAST LPAREN expression AS data_type RPAREN
    | AGG LPAREN (DISTINCT)? expression RPAREN
    ;

common_clause_token:
    NUMBER 
    |   CHAR 
    |   UNQUOTEDIDENTIFIER
    |   PUNCTUATION
    |   KEYWORD
    |   AND
    |   OR 
    |   STAR
    |   DOT 
    |   LT
    |   EQ
    |   GT
    |   Q
    |   LBRKT
    |   RBRKT
    ; 

令牌列表:

[0,0:5,<SELECT> = 'SELECT']
[1,6:6,<WS> = ' ']
[2,7:10,<UNQUOTEDIDENTIFIER> = 'col1']
[3,11:11,<COMMA> = ',']
[4,12:12,<WS> = ' ']
[5,13:16,<UNQUOTEDIDENTIFIER> = 'col2']
[6,17:17,<COMMA> = ',']
[7,18:18,<WS> = ' ']
[8,19:22,<UNQUOTEDIDENTIFIER> = 'col3']
[9,23:24,<WS> = '
']
[10,25:28,<FROM> = 'FROM']
[11,29:29,<WS> = ' ']
[12,30:31,<UNQUOTEDIDENTIFIER> = 'T1']
[13,32:33,<WS> = '
']
[14,34:38,<WHERE> = 'WHERE']
[15,39:41,<WS> = '   ']
[16,42:42,<LPAREN> = '(']
[17,43:45,<UNQUOTEDIDENTIFIER> = 'NS1']
[18,46:46,<DOT> = '.']
[19,47:51,<UNQUOTEDIDENTIFIER> = 'Func1']
[20,52:52,<LPAREN> = '(']
[21,53:56,<UNQUOTEDIDENTIFIER> = 'col1']
[22,57:57,<COMMA> = ',']
[23,58:58,<WS> = ' ']
[24,59:61,<NUMBER> = '123']
[25,62:62,<RPAREN> = ')']
[26,63:63,<WS> = ' ']
[27,64:65,<PUNCTUATION> = '==']
[28,66:66,<WS> = ' ']
[29,67:71,<UNQUOTEDIDENTIFIER> = 'false']
[30,72:72,<RPAREN> = ')']
[31,73:73,<WS> = ' ']
[32,74:76,<AND> = 'AND']
[33,77:87,<WS> = ' 
        ']
[34,88:88,<LPAREN> = '(']
[35,89:94,<UNQUOTEDIDENTIFIER> = 'String']
[36,95:95,<DOT> = '.']
[37,96:108,<UNQUOTEDIDENTIFIER> = 'IsNullOrEmpty']
[38,109:109,<LPAREN> = '(']
[39,110:114,<UNQUOTEDIDENTIFIER> = 'Func2']
[40,115:115,<LPAREN> = '(']
[41,116:119,<UNQUOTEDIDENTIFIER> = 'col2']
[42,120:120,<RPAREN> = ')']
[43,121:121,<RPAREN> = ')']
[44,122:122,<RPAREN> = ')']
[45,123:123,<WS> = ' ']
[46,124:126,<AND> = 'AND']
[47,127:136,<WS> = '
        ']
[48,137:137,<LPAREN> = '(']
[49,138:140,<NOT> = 'NOT']
[50,141:141,<WS> = ' ']
[51,142:142,<LPAREN> = '(']
[52,143:145,<UNQUOTEDIDENTIFIER> = 'NS2']
[53,146:146,<DOT> = '.']
[54,147:151,<UNQUOTEDIDENTIFIER> = 'Func3']
[55,152:152,<LPAREN> = '(']
[56,153:156,<UNQUOTEDIDENTIFIER> = 'col3']
[57,157:157,<RPAREN> = ')']
[58,158:158,<WS> = ' ']
[59,159:160,<PUNCTUATION> = '==']
[60,161:161,<WS> = ' ']
[61,162:164,<STRING> = '"1"']
[62,165:165,<RPAREN> = ')']
[63,166:166,<RPAREN> = ')']
[64,167:167,<WS> = ' ']
[65,168:168,<SEMI> = ';']
[66,169:168,<EOF> = '<EOF>']

解析树:

(query 
(query_expression (query_primary (select_query SELECT opt_distinct opt_top 
(select_item_list (select_item_list (select_item_list (select_item 
(expression (expression_body (identifier_up_to_two_parts col1))) opt_alias)) , 
(select_item (expression (expression_body (identifier_up_to_two_parts col2))) opt_alias)) , 
(select_item (expression (expression_body (identifier_up_to_two_parts col3))) opt_alias)) 
(opt_select_query_clause_list (select_query_clause_list (select_query_clause_list (select_query_clause (select_from_clause 
(table_from_clause FROM (table_from_source (join_expression (aliased_table (input_rowset (identifier_up_to_two_parts T1)) opt_alias))))))) (select_query_clause 
(where_clause WHERE (expression (expression_body ( (expression (expression_body 
(func_call NS1 . Func1 ( (expression (expression_body (identifier_up_to_two_parts col1))) , (expression (expression_body 123)) ))) 
(expression_tail (common_clause_token ==) 
(expression (expression_body (identifier_up_to_two_parts false))))) )) 
(expression_tail (common_clause_token AND) 
(expression (expression_body ( 
(expression (expression_body (func_call String . IsNullOrEmpty ( 
(expression (expression_body (func_call Func2 ( 
(expression (expression_body (identifier_up_to_two_parts col2))) )))) )))) )))))
) // JS: End of "where" tree
)))))) 
AND ( NOT ( NS2 . Func3 ( col3 ) == "1" ) ) ;)

1 个答案:

答案 0 :(得分:1)

AND似乎不是问题 - 第一个AND实际上已正确解析:(expression_tail (common_clause_token AND)

但是,我认为没有接受NOT (<expression>)的定义。