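Aggregate function combined with an analytic function when partitioning in Teradata: "Selected non-aggregate values must be part of the associated group"

Time: 2019-01-04 09:29:41

Tags: sql teradata partition

I have to count the number of status changes, but only when the time difference between one status and the next is less than 30 minutes. My table has a current-time column and a previous-time column, which I built with OVER (PARTITION BY ...). My query fails with the error "Selected non-aggregate values must be part of the associated group". Can anyone help?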

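2 answers:

Answer 0: (score: 1)

Analytic functions are processed after aggregation (the logical order is FROM, WHERE, GROUP BY, HAVING, OLAP, QUALIFY, ORDER BY), so you cannot apply an aggregate to the result of an OVER expression in the same SELECT. You have to nest it in a derived table or a common table expression: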

SELECT
   Sum(
       CASE WHEN (status_previous='Test1' AND status_current='Test2')
              OR (status_previous='Test3' AND status_current='Test2')
              OR (status_previous='Test4' AND status_current='Test2')
            THEN 1
            ELSE 0
       END) AS "Total_Change"
FROM
 (
   SELECT col1, col2,
       status_previous, status_current, -- needed by the outer CASE
       -- previous row's timestamp within the partition
       Max(creation_dt)
       Over(PARTITION BY col1,col2,col3
            ORDER BY creation_dt
            ROWS BETWEEN 1 Preceding AND 1 Preceding) AS prev_creation_dt,

       (creation_dt - prev_creation_dt) DAY(4) TO SECOND(6) AS time_difference,

       Extract(DAY From time_difference) * 24*60 + Extract(HOUR From time_difference) * 60 + Extract(MINUTE From time_difference) AS Total_Minutes

   FROM myTable
   WHERE Extract(YEAR From year_column)=2017 -- the result of EXTRACT is an INTEGER, not a string
   QUALIFY Total_Minutes<30
 ) AS dt

However, since you only want a count, you can move the CASE condition into the QUALIFY:

SELECT Count(*) AS "Total_Change"
FROM
 (
   SELECT col1, col2,
       -- previous row's timestamp within the partition
       Max(creation_dt)
       Over(PARTITION BY col1,col2,col3
            ORDER BY creation_dt
            ROWS BETWEEN 1 Preceding AND 1 Preceding) AS prev_creation_dt,

       (creation_dt - prev_creation_dt) DAY(4) TO SECOND(6) AS time_difference,

       Extract(DAY From time_difference) * 24*60 + Extract(HOUR From time_difference) * 60 + Extract(MINUTE From time_difference) AS Total_Minutes

   FROM myTable
   WHERE Extract(YEAR From year_column)=2017 -- the result of EXTRACT is an INTEGER, not a string
   QUALIFY Total_Minutes<30
       AND (   (status_previous='Test1' AND status_current='Test2')
            OR (status_previous='Test3' AND status_current='Test2')
            OR (status_previous='Test4' AND status_current='Test2')
           )
 ) AS dt
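
The answer mentions a common table expression as an alternative to the derived table but only shows the derived-table form. A rough sketch of the same count written with WITH might look like this. It assumes the same table and column names as above (myTable, creation_dt, status_previous, status_current, year_column), none of which are confirmed by the question, and it is untested:

```sql
-- Sketch only: same logic as the derived-table query, moved into a CTE.
-- Table and column names are assumptions carried over from the answer.
WITH dt AS
 (
   SELECT col1, col2,
       -- previous row's timestamp within the partition
       Max(creation_dt)
       Over (PARTITION BY col1, col2, col3
             ORDER BY creation_dt
             ROWS BETWEEN 1 Preceding AND 1 Preceding) AS prev_creation_dt,

       (creation_dt - prev_creation_dt) DAY(4) TO SECOND(6) AS time_difference,

       Extract(DAY From time_difference) * 24*60
     + Extract(HOUR From time_difference) * 60
     + Extract(MINUTE From time_difference) AS Total_Minutes

   FROM myTable
   WHERE Extract(YEAR From year_column) = 2017
   -- the window result is filtered here, before any aggregation happens
   QUALIFY Total_Minutes < 30
       AND status_current = 'Test2'
       AND status_previous IN ('Test1', 'Test3', 'Test4')
 )
SELECT Count(*) AS "Total_Change"
FROM dt;
```

Either way the point is the same: the OVER expression is evaluated in the inner query, and the aggregate only sees its finished result.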

Edit:

The CASE logic can be simplified further to:

CASE WHEN status_current='Test2' AND status_previous IN ('Test1','Test3','Test4')
     THEN 1
     ELSE 0
END

or perhaps

CASE WHEN status_current='Test2' AND status_previous <>'Test2'
     THEN 1
     ELSE 0
END

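Answer 1: (score: 0)

I think the QUALIFY should come after the WHERE clause.

For the previous value, I think LAG is a better fit than MAX.

Those nested CASE expressions can be written as a single CASE, because once a WHEN condition is met, the WHEN conditions after it are not evaluated.

Since a plain SUM is used together with non-aggregated columns, a GROUP BY is needed.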

SELECT col1, col2,
 COUNT(*) AS Total,
 SUM(TimeDiffMinutes) AS Total_Minutes,
 SUM(CASE WHEN StatusChanged = 1 THEN TimeDiffMinutes ELSE 0 END) AS Total_Minutes_Change,
 COUNT(CASE WHEN StatusChanged = 1 THEN 1 END) AS Total_Change
FROM
(
  SELECT col1, col2, col3, creation_dt,
  (CASE 
   WHEN status_previous='Test1' and status_current='Test2' THEN 1
   WHEN status_previous='Test3' and status_current='Test2' THEN 1   
   WHEN status_previous='Test4' and status_current='Test2' THEN 1
   ELSE 0
   END) AS StatusChanged,
  LAG(creation_dt) OVER (PARTITION BY col1, col2, col3 ORDER BY creation_dt) AS prev_creation_dt,
  (creation_dt - prev_creation_dt) DAY(4) TO SECOND(6) AS time_difference,
  EXTRACT(DAY FROM time_difference)*(24*60) + EXTRACT(HOUR FROM time_difference)*60 + EXTRACT(MINUTE FROM time_difference) AS TimeDiffMinutes
  FROM myTable  
  WHERE EXTRACT(YEAR FROM year_column) = 2017
  QUALIFY (creation_dt - prev_creation_dt) day(4) to second(6) < interval '30' minute
) q
GROUP BY col1, col2
ORDER BY col1, col2
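
To illustrate the LAG remark above, here is a small sketch that puts both ways of fetching the previous row's value side by side. The volatile table status_demo and its contents are made up purely for the demo, and LAG is assumed to be available, which as far as I know requires a newer Teradata release (16.10 or later); on older releases only the MAX variant would work.

```sql
-- Hypothetical demo table; the name and the data are invented for illustration
CREATE VOLATILE TABLE status_demo
(
  id          INTEGER,
  creation_dt TIMESTAMP(0)
) ON COMMIT PRESERVE ROWS;

INSERT INTO status_demo VALUES (1, TIMESTAMP '2017-01-01 10:00:00');
INSERT INTO status_demo VALUES (1, TIMESTAMP '2017-01-01 10:20:00');
INSERT INTO status_demo VALUES (1, TIMESTAMP '2017-01-01 11:30:00');

SELECT id, creation_dt,
       -- window frame restricted to exactly the preceding row
       MAX(creation_dt) OVER (PARTITION BY id
                              ORDER BY creation_dt
                              ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_via_max,
       -- same value, stated directly
       LAG(creation_dt) OVER (PARTITION BY id
                              ORDER BY creation_dt) AS prev_via_lag
FROM status_demo
ORDER BY id, creation_dt;
```

Both columns should hold the same value on every row, with NULL on the first row of each partition, so the choice between them is mostly about readability and release support.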