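Problem with aggregate columns when no non-aggregated columns are present

Time: 2016-01-15 15:44:28

Tags: sql amazon-web-services amazon-redshift

I am running into an aggregation error on Amazon Redshift when I run the following query: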

select case when frequency between (avg(frequency) + stddev(frequency)) and (avg(frequency) - stddev(frequency)) then  round(avg(frequency) - stddev(frequency))||'-'||round(avg(frequency) + stddev(frequency))
       when frequency between (avg(frequency) + 2*stddev(frequency)) and (avg(frequency) - 2*stddev(frequency)) then  round(avg(frequency) - 2*stddev(frequency))||'-'||round(avg(frequency) + 2*stddev(frequency))
       when frequency between (avg(frequency) + 3*stddev(frequency)) and (avg(frequency) - 3*stddev(frequency)) then  round(avg(frequency) - 3*stddev(frequency))||'-'||round(avg(frequency) + 3*stddev(frequency))
          else null
           end as deviation 
from schema.table

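The error tells me that I need to include frequency in the GROUP BY clause. If I do that, I then get "aggregates not allowed in GROUP BY". Does anyone know why this happens? My first guess was that it might be a data type issue, but playing around with the types did not help.

Thanks!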

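2 Answers:

Answer 0: (score: 0)

These queries can be confusing. The error comes from mixing the per-row column frequency with aggregates over the whole table in the same SELECT. You can either compute the aggregates separately in a subquery and then use them on every row via a cross join, or you can use analytic (window) functions, which give you the aggregate values without a GROUP BY: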

SELECT CASE
         -- BETWEEN expects the lower bound first, so each range runs from (avg - k*dev) to (avg + k*dev)
         WHEN frequency BETWEEN (avg_Freq - dev_Freq) AND (avg_Freq + dev_Freq) THEN round(avg_Freq - dev_Freq)||'-'||round(avg_Freq + dev_Freq)
         WHEN frequency BETWEEN (avg_Freq - 2*dev_Freq) AND (avg_Freq + 2*dev_Freq) THEN round(avg_Freq - 2*dev_Freq)||'-'||round(avg_Freq + 2*dev_Freq)
         WHEN frequency BETWEEN (avg_Freq - 3*dev_Freq) AND (avg_Freq + 3*dev_Freq) THEN round(avg_Freq - 3*dev_Freq)||'-'||round(avg_Freq + 3*dev_Freq)
         ELSE NULL
       END AS deviation
FROM schema.table
CROSS JOIN (SELECT avg(frequency) AS avg_Freq
                  ,stddev(frequency) AS dev_Freq
            FROM schema.table
           ) sub
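
Stripped down to a single aggregate, the idea behind the cross join can be sketched as follows. This is only an illustration: schema.table and frequency are the placeholders from the question, while the aliases t, s and avg_freq are made up here.

```sql
-- Rejected: frequency is a per-row value, while avg(frequency) is a single
-- value for the whole table, so the two cannot sit in one ungrouped SELECT.
-- SELECT frequency, avg(frequency) FROM schema.table;

-- Accepted: the subquery returns exactly one row, and CROSS JOIN repeats
-- that row next to every row of the table.
SELECT t.frequency,
       s.avg_freq
FROM schema.table t
CROSS JOIN (SELECT avg(frequency) AS avg_freq
            FROM schema.table) s;
```

Because the subquery has no GROUP BY and selects only aggregates, it always yields one row, so the join does not change the number of rows returned.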

Alternatively, you can add OVER() to each aggregate in your existing query:

SELECT CASE
         -- BETWEEN expects the lower bound first, so each range runs from (avg - k*stddev) to (avg + k*stddev)
         WHEN frequency BETWEEN (avg(frequency) OVER() - stddev(frequency) OVER()) AND (avg(frequency) OVER() + stddev(frequency) OVER()) THEN round(avg(frequency) OVER() - stddev(frequency) OVER())||'-'||round(avg(frequency) OVER() + stddev(frequency) OVER())
         WHEN frequency BETWEEN (avg(frequency) OVER() - 2*stddev(frequency) OVER()) AND (avg(frequency) OVER() + 2*stddev(frequency) OVER()) THEN round(avg(frequency) OVER() - 2*stddev(frequency) OVER())||'-'||round(avg(frequency) OVER() + 2*stddev(frequency) OVER())
         WHEN frequency BETWEEN (avg(frequency) OVER() - 3*stddev(frequency) OVER()) AND (avg(frequency) OVER() + 3*stddev(frequency) OVER()) THEN round(avg(frequency) OVER() - 3*stddev(frequency) OVER())||'-'||round(avg(frequency) OVER() + 3*stddev(frequency) OVER())
         ELSE NULL
       END AS deviation
FROM schema.table

I am not 100% sure about the Redshift syntax, but I believe both should work.
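
If repeating OVER() on every aggregate gets hard to read, one possible variation is to compute the two window values once in a derived table and reference them by name. This is a sketch under the same caveat about Redshift syntax: schema.table and frequency come from the question, and the aliases stats, avg_freq and dev_freq are invented for the example.

```sql
SELECT CASE
         -- Lower bound first: BETWEEN low AND high
         WHEN frequency BETWEEN avg_freq - dev_freq AND avg_freq + dev_freq
           THEN round(avg_freq - dev_freq) || '-' || round(avg_freq + dev_freq)
         WHEN frequency BETWEEN avg_freq - 2 * dev_freq AND avg_freq + 2 * dev_freq
           THEN round(avg_freq - 2 * dev_freq) || '-' || round(avg_freq + 2 * dev_freq)
         WHEN frequency BETWEEN avg_freq - 3 * dev_freq AND avg_freq + 3 * dev_freq
           THEN round(avg_freq - 3 * dev_freq) || '-' || round(avg_freq + 3 * dev_freq)
         ELSE NULL
       END AS deviation
FROM (
  -- An empty OVER () makes each aggregate span the whole table,
  -- so every row carries the same avg_freq and dev_freq.
  SELECT frequency,
         avg(frequency)    OVER () AS avg_freq,
         stddev(frequency) OVER () AS dev_freq
  FROM schema.table
) stats;
```

The CASE branches are evaluated in order, so a row within one standard deviation gets the narrowest label and only falls through to the wider ranges when the earlier ones do not match.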

Answer 1: (score: 0)

You can break it up along these lines:

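WITH stats AS (
  SELECT avg(frequency) AS avg_freq, stddev(frequency) AS dev_freq
  FROM schema.table
)
SELECT CASE WHEN frequency BETWEEN stats.avg_freq - stats.dev_freq AND stats.avg_freq + stats.dev_freq
            THEN round(stats.avg_freq - stats.dev_freq)||'-'||round(stats.avg_freq + stats.dev_freq)
            -- etc.: add the 2x and 3x branches in the same way
       END AS deviation
FROM schema.table
CROSS JOIN stats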

You will have to check the exact statement. I wrote this off the top of my head.