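Query data for the previous n months, grouped by month

Asked: 2018-02-19 18:19:23

Tags: sql amazon-redshift

I am building a query that returns the distinct counts of two columns and the sum of a third column, grouped by month on a date column, for the most recent 13 months. Here is my query: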

SELECT  TO_CHAR(colDate,'yyyy_MM') as month ,
        COUNT(DISTINCT col1) AS col1,
        COUNT(DISTINCT col2) as col2,
        SUM(col3) as col3 
FROM myTable
WHERE TO_CHAR(colDate,'yyyy_MM') IN (select distinct TO_CHAR(colDate,'yyyy_MM')
                                     from myTable
                                     order by  1 desc
                                     limit 13)
GROUP BY 1

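The problem is that, for each month, I also need the average over the previous 3 months of each of these:

COUNT(DISTINCT col1) AS col1, COUNT(DISTINCT col2) AS col2, SUM(col3) AS col3

So my query needs to look something like this: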

SELECT  TO_CHAR(colDate,'yyyy_MM') as month ,
            COUNT(DISTINCT col1) AS col1,
            COUNT(DISTINCT col2) as col2,
            SUM(col3) as col3,
            ... as PreviousMonthsAvgCol1,
            ... as PreviousMonthsAvgCol2,
            ... as PreviousMonthsAvgCol3
    FROM myTable
    WHERE TO_CHAR(colDate,'yyyy_MM') IN (select distinct TO_CHAR(colDate,'yyyy_MM')
                                         from myTable
                                         order by  1 desc
                                         limit 13)
    GROUP BY 1

The months before the first of the 13 months still need to be included when computing the average for that first month.

2 answers:

Answer 0 (score: 1)

If you do not need data from before the 13 months, use lag():

SELECT . . .,
       LAG(COUNT(DISTINCT col1)) OVER (ORDER BY MIN(colDate)) as prev_col1,
   . . . 
FROM myTable . . .;
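LAG() returns a value from a single earlier row, so an average over the three preceding months may be easier to express with a window frame. The following is a minimal sketch of that idea for col1 only, not the answerer's exact query. It assumes colDate is a date or timestamp column; the CTE name monthly and the alias month_start are hypothetical, and the cast to FLOAT is there only to avoid any integer rounding of the average.

```sql
-- Sketch: aggregate per month first, then average the three preceding rows.
WITH monthly AS (
    SELECT DATE_TRUNC('month', colDate) AS month_start,   -- hypothetical alias
           COUNT(DISTINCT col1)         AS col1
    FROM myTable
    GROUP BY 1
)
SELECT TO_CHAR(month_start, 'yyyy_MM') AS month,
       col1,
       -- The frame covers the 3 rows before the current one and excludes the current row.
       AVG(col1::FLOAT) OVER (ORDER BY month_start
                              ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) AS PreviousMonthsAvgCol1
FROM monthly
ORDER BY month_start;
```

A ROWS frame counts rows rather than calendar months, so if a month has no data at all, the frame reaches back to the three nearest months that do. The earliest rows have fewer than three predecessors; the very first one returns NULL.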

If you do need the earlier data, run the full aggregation first and then select the 13 months.
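One way to read "full aggregation, then select the 13 months" is sketched below: aggregate every month in the table, compute the trailing averages over all of them, and only then keep the 13 most recent rows. This is an illustration under assumptions, not a tested solution: the CTE names monthly and with_avg and the columns month_start and recency_rank are hypothetical, and the FLOAT casts assume a non-integer average is wanted.

```sql
-- Sketch: window functions run over every month, the 13-month filter is applied last.
WITH monthly AS (
    SELECT DATE_TRUNC('month', colDate) AS month_start,
           COUNT(DISTINCT col1)         AS col1,
           COUNT(DISTINCT col2)         AS col2,
           SUM(col3)                    AS col3
    FROM myTable
    GROUP BY 1
),
with_avg AS (
    SELECT month_start, col1, col2, col3,
           AVG(col1::FLOAT) OVER (ORDER BY month_start
                                  ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) AS PreviousMonthsAvgCol1,
           AVG(col2::FLOAT) OVER (ORDER BY month_start
                                  ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) AS PreviousMonthsAvgCol2,
           AVG(col3::FLOAT) OVER (ORDER BY month_start
                                  ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) AS PreviousMonthsAvgCol3,
           -- 1 = most recent month present in the data
           ROW_NUMBER() OVER (ORDER BY month_start DESC) AS recency_rank
    FROM monthly
)
SELECT TO_CHAR(month_start, 'yyyy_MM') AS month,
       col1, col2, col3,
       PreviousMonthsAvgCol1, PreviousMonthsAvgCol2, PreviousMonthsAvgCol3
FROM with_avg
WHERE recency_rank <= 13
ORDER BY month_start;
```

Because the filter on recency_rank runs after the window functions, the oldest of the 13 months still sees the months before it when its averages are computed, which is the requirement stated in the question.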

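Answer 1 (score: 0)

I agree with Gordon Lindoff's answer.

However, I would advise against using TO_CHAR() in the date-range predicate. It forces Redshift to scan more data than necessary.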

Try using colDate BETWEEN '2017-01-01' AND '2018-01-31' instead. If you have to round the dates to whole months, use DATE_TRUNC().
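A sketch of what such a predicate could look like is shown below. It filters on the raw colDate column so the comparison is not wrapped in a function, and applies DATE_TRUNC() only in the select list. Note the assumption: the window is the current calendar month plus the 12 before it, relative to CURRENT_DATE, which is not the same as the question's "latest 13 months present in the table". The half-open upper bound is a design choice so that timestamps late on the last day of the month are not missed.

```sql
-- Sketch: range predicate on the bare column, month rounding only for grouping.
SELECT TO_CHAR(DATE_TRUNC('month', colDate), 'yyyy_MM') AS month,
       COUNT(DISTINCT col1) AS col1,
       COUNT(DISTINCT col2) AS col2,
       SUM(col3)            AS col3
FROM myTable
WHERE colDate >= DATEADD(month, -12, DATE_TRUNC('month', CURRENT_DATE))  -- start of the month 12 months back
  AND colDate <  DATEADD(month, 1, DATE_TRUNC('month', CURRENT_DATE))    -- start of next month (exclusive)
GROUP BY 1
ORDER BY 1;
```

If the averages of the preceding three months are also needed, the lower bound would have to reach three months further back so those months are available to the aggregation.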