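I am writing an analytics query against a user activity log table in Postgres 9.3. The table has a signup date, a data field (which can be summed), and a user type. I have built some sample data and SQL for this question, and I would appreciate help figuring out the last part. The SQL needed for testing is below. It drops and creates a table named facts, so be sure to work in a sandbox.
I am aggregating the data by week and user type, so for each week you get the sum of the data field for each user type. The problem is that my results are missing a week for user type 'x'. Because there is no data for user type 'x' in the week of 2013-09-09, no row appears for it (see the sample results below). I would like a row for that user type and week. If possible, I would like to do this with a single select statement, with no temp tables or dimension tables. The reason is that I will be handing this SQL to a business manager, and a single self-contained select statement is hopefully more foolproof (criticism of this approach is welcome, but it is not an answer). Thanks, everyone, for your help!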
Here are the results I am getting:
 Sum | test_week  | user_type
-----+------------+-----------
   4 | 2013-09-02 | x
   5 | 2013-09-02 | y
  10 | 2013-09-09 | y
   2 | 2013-09-16 | x
   1 | 2013-09-16 | y
Here are the results I want:
 Sum | test_week  | user_type
-----+------------+-----------
   4 | 2013-09-02 | x
   5 | 2013-09-02 | y
   0 | 2013-09-09 | x
  10 | 2013-09-09 | y
   2 | 2013-09-16 | x
   1 | 2013-09-16 | y
Here is the test data and the SQL select statement:
drop table if exists facts;
create temp table facts (signup_date date, data integer, record_type varchar, alt varchar);
insert into facts (signup_date, data, record_type) values
('9/3/2013',1,'x'),
('9/4/2013',1,'y'),
('9/5/2013',2,'x'),
('9/6/2013',3,'y'),
('9/7/2013',1,'x'),
('9/8/2013',1,'y'),
-- note the week of 9/9 to 9/16 has no 'x' records
('9/9/2013',2,'y'),
('9/10/2013', 3, 'y'),
('9/11/2013', 4, 'y'),
('9/12/2013', 1, 'y'),
('9/17/2013', 2, 'x'),
('9/18/2013', 1, 'y');
select coalesce(data, 0), test_week, record_type
from
    (select sum(data) as data,
            record_type,
            to_timestamp(EXTRACT(YEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date), 'IYYY IW')::date as test_week
     from facts
     group by record_type, test_week
    ) as facts
order by test_week, record_type;
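Answer 0 (score: 1)
To solve this, build a list of every combination of record_type and test week. Then left join from those combinations to the actual facts table. Every combination is kept whether or not it has matching facts, so you also get the rows that have no data: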
-- ISOYEAR is used instead of YEAR so that the year agrees with the ISO week number around New Year
select coalesce(sum(f.data), 0) as data, rt.record_type, w.test_week
from (select distinct record_type from facts) rt
cross join
     -- every week that has at least one record of any type
     (select distinct to_timestamp(EXTRACT(ISOYEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date), 'IYYY IW')::date as test_week
      from facts
     ) w
left outer join facts f
     on f.record_type = rt.record_type
    and w.test_week = to_timestamp(EXTRACT(ISOYEAR FROM f.signup_date) || ' ' || EXTRACT(WEEK FROM f.signup_date), 'IYYY IW')::date
group by rt.record_type, w.test_week
order by w.test_week, rt.record_type;
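One limitation of taking the list of weeks from facts itself is that a week with no rows for any record type would still be missing from the output. A possible variant, offered as a sketch that has not been run against the original 9.3 setup, builds the weeks with generate_series between the first and last signup week instead. It assumes that date_trunc('week', ...), which truncates to the Monday of the ISO week, is an acceptable replacement for the to_timestamp expression above. The column aliases "Sum" and user_type are chosen here only to match the desired output.

```sql
-- Sketch: one row per (week, record type), including weeks that have no facts at all.
select coalesce(sum(f.data), 0) as "Sum",
       w.test_week,
       rt.record_type as user_type
from (select distinct record_type from facts) rt
cross join
     -- one row per Monday from the first signup week through the last
     (select generate_series(date_trunc('week', b.first_day::timestamp),
                             date_trunc('week', b.last_day::timestamp),
                             interval '1 week')::date as test_week
      from (select min(signup_date) as first_day,
                   max(signup_date) as last_day
            from facts) b
     ) w
left join facts f
     on f.record_type = rt.record_type
    and date_trunc('week', f.signup_date::timestamp)::date = w.test_week
group by w.test_week, rt.record_type
order by w.test_week, rt.record_type;
```

The list of record types still comes from facts, so a record type that never appears in the table will not show up. Covering that case would need an explicit list of types, which goes beyond the single self-contained statement asked for.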
Answer 1 (score: 1)
select
coalesce(sum(data), 0) as "Sum",
to_char(date_trunc('week', c.signup_date), 'YYYY-MM-DD') as test_week,
c.record_type as user_type
from
facts f
right join
(
(
select distinct record_type
from facts
) f1
cross join
(
select distinct signup_date
from facts
) f2
) c on f.record_type = c.record_type and f.signup_date = c.signup_date
group by 2, 3
order by 2, 3
;
Sum | test_week | user_type
-----+------------+-----------
4 | 2013-09-02 | x
5 | 2013-09-02 | y
0 | 2013-09-09 | x
10 | 2013-09-09 | y
2 | 2013-09-16 | x
1 | 2013-09-16 | y
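Answer 2 (score: 0)
After playing with the SQL some more myself, I have another solution that also works. I am fairly sure this query does not perform as well as Clodoaldo Neto's or Gordon Linoff's, but I thought I would share another form of SQL that solves the problem: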
-- ISOYEAR is used instead of YEAR so that the year agrees with the ISO week number around New Year
select coalesce(data, 0), rt as record_type, weeks
from
    -- weekly sums per record type; combinations with no rows are absent here
    (select sum(data) as data,
            record_type,
            to_timestamp(EXTRACT(ISOYEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date), 'IYYY IW')::date as test_week
     from facts
     group by record_type, test_week
    ) as facts
right join
    -- every (week, record type) combination
    (select distinct to_timestamp(EXTRACT(ISOYEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date), 'IYYY IW')::date as weeks,
            rts.rt as rt
     from facts
     cross join (select distinct record_type from facts) as rts (rt)
     -- this cross join does not change the result, because alt is never selected
     cross join (select distinct alt from facts) as alts (at)
    ) as dates
    on dates.weeks = facts.test_week
   and dates.rt = facts.record_type
order by weeks, record_type;