Spark SQL correlated scalar subquery does not work with BETWEEN

Date: 2019-07-09 10:52:23

Tags: sql apache-spark between

I am trying to query across several tables, and I need a correlated subquery that uses BETWEEN. The query runs fine as long as I do not use operators such as BETWEEN, IN, or >= on the correlated columns.

I think this is because Spark SQL does not allow a correlated column to appear in anything other than an equality (=) predicate when the subquery is aggregated.

If I remove the "AND c1.month BETWEEN ..." condition, the query runs, and it also runs if I write a constant, non-correlated range such as AND c1.month BETWEEN 1 AND 3 (for example).
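To make the pattern concrete, here is a minimal sketch with hypothetical tables t1(k, lo, hi) and t2(k, x); the equality form compiles, while the range form fails with the same AnalysisException:

    -- accepted: the outer column t1.k only appears in an equality predicate
    SELECT (SELECT SUM(t2.x) FROM t2 WHERE t2.k = t1.k) AS s
    FROM t1;

    -- rejected: the outer columns t1.lo and t1.hi appear in a range predicate
    -- ("Correlated column is not allowed in a non-equality predicate")
    SELECT (SELECT SUM(t2.x) FROM t2 WHERE t2.x BETWEEN t1.lo AND t1.hi) AS s
    FROM t1;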

Thanks in advance for your help.

Here is the code:

    SELECT (cr.volume  /
            (SELECT SUM(hours)
             FROM TempsuCalendar c1
             WHERE YEAR(cr.deliveryfromdate) = c1.year AND
                   c1.month BETWEEN MONTH(cr.deliveryfromdate) AND
                                    MONTH(cr.deliverytodate) AND
                   c1.region  = cr.NewRegion AND
                   c1.loadtype = 'B'
            )
           ) as AVERAGEPOWER,
           T.MODIFIEDBY,
           T.MODIFICATIONDATE
    FROM TempenmTransactionmapping tm JOIN --DWH.EnmTransactionMapping TM
         TempsuTransactionmapping  sm      --DWH.SuTransactionMapping SM
         ON tm.id_sutransactionmapping = sm.id JOIN
         TempsuTransaction t               --DWH.SuTransaction T
         ON sm.id_transaction = t.id_transaction JOIN
         TempsuCalculationrow cr   --DWH.SuCalculationRow CR
         ON t.id_calculationrow = cr.id_calculationr

The error message is:

    Error in SQL statement: AnalysisException: Correlated column is not allowed in a non-equality predicate:
    Aggregate [sum(cast(hours#15695 as bigint)) AS sum(hours)#17948L]
    +- Filter (((year(cast(outer(deliveryfromdate#15543) as date)) = year#15690) && ((month#15691 >= month(cast(outer(deliveryfromdate#15543) as date))) && (month#15691 <= month(cast(outer(deliverytodate#15544) as date))))) && ((region#15687 = outer(NewRegion#15829)) && (loadtype#15689 = B)))
       +- SubqueryAlias `c1`
          +- SubqueryAlias `tempsucalendar`
             +- Relation[REGION#15687,CALENDARDATE#15688,LOADTYPE#15689,YEAR#15690,MONTH#15691,WEEK#15692,DAYINMONTH#15693,WEEKDAY#15694,HOURS#15695,HOLIDAY#15696,HistoryDate#15697,ProcessingDate#15698,RankNo#15699] csv
    ;;
    Distinct
    +- Union... AND MORE
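
A workaround that might sidestep the restriction (a sketch only, assuming each result row is uniquely identified by the GROUP BY columns below): decorrelate the scalar subquery into an ordinary join on TempsuCalendar, where range predicates are allowed, and compute the sum with GROUP BY:

    SELECT cr.volume / SUM(c1.hours) AS AVERAGEPOWER,
           t.MODIFIEDBY,
           t.MODIFICATIONDATE
    FROM TempenmTransactionmapping tm JOIN
         TempsuTransactionmapping sm
         ON tm.id_sutransactionmapping = sm.id JOIN
         TempsuTransaction t
         ON sm.id_transaction = t.id_transaction JOIN
         TempsuCalculationrow cr
         ON t.id_calculationrow = cr.id_calculationr JOIN
         TempsuCalendar c1
         -- the former correlated predicates become ordinary join conditions,
         -- where BETWEEN is accepted
         ON YEAR(cr.deliveryfromdate) = c1.year AND
            c1.month BETWEEN MONTH(cr.deliveryfromdate) AND MONTH(cr.deliverytodate) AND
            c1.region = cr.NewRegion AND
            c1.loadtype = 'B'
    GROUP BY cr.id_calculationr, cr.volume, t.MODIFIEDBY, t.MODIFICATIONDATE

Note that the inner join drops rows with no matching calendar hours, where the scalar subquery would have produced a NULL divisor; a LEFT JOIN would preserve those rows with a NULL AVERAGEPOWER instead.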

0 Answers:

No answers yet.