I am writing a simple query on Amazon Redshift, as follows:
SELECT EXTRACT(year FROM created_at) AS year,
EXTRACT(month FROM created_at) AS month,
member_id,
COUNT(*) as pageviews
FROM TABLE
GROUP BY year,
month,
member_id
ORDER BY year,
month,
member_id
This returns, for example, the following result:
year month member_id pageviews
2015 1 100 29
2015 2 100 22
2015 3 100 178
2015 4 100 34
2015 1 200 56
2015 3 200 16
This is the result I want:
year month member_id pageviews
2015 1 100 29
2015 2 100 22
2015 3 100 178
2015 4 100 34
2015 1 200 56
2015 2 200 0
2015 3 200 16
2015 4 200 0
Note the additional rows with zero pageviews in the result above.
How can I get this result? Any help would be appreciated.
Answer 0 (score: 2)
Use a cross join to generate every year/month/member row, then a left join to bring in the data:
SELECT ym.year,
ym.month,
m.member_id,
COUNT(t.member_id) as pageviews
FROM (SELECT DISTINCT EXTRACT(year FROM created_at) AS year, EXTRACT(month FROM created_at) AS month FROM TABLE) ym CROSS JOIN
(SELECT DISTINCT member_id FROM TABLE) m LEFT JOIN
TABLE t
ON EXTRACT(year FROM t.created_at) = ym.year AND
EXTRACT(month FROM t.created_at) = ym.month AND
t.member_id = m.member_id
GROUP BY ym.year, ym.month, m.member_id
ORDER BY ym.year, ym.month, m.member_id;
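This assumes that every year/month combination you want in the output appears in the table for at least one member.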
If you have other tables that can serve as the source of members and dates, try using those instead; that may be faster than SELECT DISTINCT.
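To see the cross join / left join pattern in isolation, here is a minimal self-contained sketch. The table name pageview_log and its three sample rows are hypothetical stand-ins for the real table, and the query is written for Redshift/PostgreSQL-style SQL; treat it as an illustration under those assumptions rather than a drop-in replacement.

```sql
-- Hypothetical sample data standing in for the real table (assumption).
WITH pageview_log AS (
    SELECT 100 AS member_id, DATE '2015-01-05' AS created_at UNION ALL
    SELECT 100, DATE '2015-02-10' UNION ALL
    SELECT 200, DATE '2015-01-20'
),
-- Every year/month that appears anywhere in the data.
ym AS (
    SELECT DISTINCT EXTRACT(year FROM created_at) AS year,
                    EXTRACT(month FROM created_at) AS month
    FROM pageview_log
),
-- Every member that appears anywhere in the data.
m AS (
    SELECT DISTINCT member_id FROM pageview_log
)
SELECT ym.year,
       ym.month,
       m.member_id,
       -- COUNT(column) skips NULLs, so unmatched combinations count as 0.
       COUNT(t.member_id) AS pageviews
FROM ym
CROSS JOIN m                      -- all (year, month, member) combinations
LEFT JOIN pageview_log t          -- attach the actual rows, if any
       ON EXTRACT(year FROM t.created_at) = ym.year
      AND EXTRACT(month FROM t.created_at) = ym.month
      AND t.member_id = m.member_id
GROUP BY ym.year, ym.month, m.member_id
ORDER BY ym.year, ym.month, m.member_id;
```

With this sample data, member 200 has no rows in February 2015, so the left join leaves t.member_id NULL for that combination and COUNT(t.member_id) reports 0. Using COUNT(*) here would report 1 instead, because the cross join still produces a row for that combination.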