How can I get every date for each group, in order, with a single query?

Date: 2014-02-12 03:13:28

Tags: sql postgresql group-by

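I am writing an analytics query on a user activity log table in Postgres 9.3. The table has a signup date, a data field (which can be summed), and a user type. I have built some sample data and SQL for this question, and I would appreciate help figuring out the last part. The SQL needed for testing is below. It drops and creates a table named facts, so be sure to work in a sandbox.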

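I aggregate the data by week and user type, so each week has a total of the data field for each user type. The problem is that my result is missing a week for user type 'x'. Because user type 'x' has no data in the week of 9/9/2013, no row is shown for it (see the sample results below). I would like a row for that user type and week. If possible, I want to do this with a single select statement, with no temp tables or dimension tables. This is because I am handing this SQL to a business manager, and a single self-contained select statement should be less fuss (criticism of this approach is welcome, but not as an answer). Thanks, everyone, for the help!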

Here are the results I get:

Sum     test_week       user_type
4   "2013-09-02"    "x"
5   "2013-09-02"    "y"
10  "2013-09-09"    "y"
2   "2013-09-16"    "x"
1   "2013-09-16"    "y"

Here are the results I want:

Sum     test_week       user_type
4   "2013-09-02"    "x"
5   "2013-09-02"    "y"
0   "2013-09-09"    "x"
10  "2013-09-09"    "y"
2   "2013-09-16"    "x"
1   "2013-09-16"    "y"

Here is the test data and the SQL select statement:

drop table if exists facts;
create temp table facts (signup_date date, data integer, record_type varchar, alt varchar);
insert into facts (signup_date, data, record_type) values
('9/3/2013',1,'x'),
('9/4/2013',1,'y'),
('9/5/2013',2,'x'),
('9/6/2013',3,'y'),
('9/7/2013',1,'x'),
('9/8/2013',1,'y'),
-- note the week of 9/9 to 9/16 has no 'x' records
('9/9/2013',2,'y'),
('9/10/2013', 3, 'y'),
('9/11/2013', 4, 'y'),
('9/12/2013', 1, 'y'),
('9/17/2013', 2, 'x'),
('9/18/2013', 1, 'y');

select coalesce(data, 0), test_week, record_type
  from 
    (select sum(data) as data, record_type, to_timestamp(EXTRACT(YEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date),'IYYY IW')::date as test_week
    from facts
    group by record_type, test_week
    ) as facts
  order by test_week, record_type

3 answers:

Answer 0 (score: 1)

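To solve this, build a list of every combination of record_type and test week, then left join from those combinations to the actual facts table. Every combination is kept, so you also get rows for the combinations that have no data: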

select coalesce(sum(f.data), 0) as data, rt.record_type, w.test_week
from (select distinct record_type from facts) rt cross join
     (select distinct to_timestamp(EXTRACT(YEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date),'IYYY IW')::date as test_week
      from facts
     ) w left outer join
     facts f
     on f.record_type = rt.record_type and
        w.test_week = to_timestamp(EXTRACT(YEAR FROM f.signup_date) || ' ' || EXTRACT(WEEK FROM f.signup_date),'IYYY IW')::date
group by rt.record_type, w.test_week
order by w.test_week, rt.record_type;
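
A variation on the same idea, offered as a sketch rather than as part of the original answer: the list of weeks above comes from select distinct on facts, so a week in which no record_type has any rows would still be missing. Assuming that case matters, the weeks can be generated with generate_series between the first and last week present in the table. Using date_trunc('week', ...) for the week start is also an assumption here; it replaces the to_timestamp expression from the question.

```sql
-- Sketch only: generate every week between the earliest and latest signup_date,
-- then left join the facts onto the (record_type, week) grid.
select coalesce(sum(f.data), 0) as data,
       w.test_week::date        as test_week,
       rt.record_type
from (select distinct record_type from facts) rt
cross join (
    -- one row per Monday from the first week with data to the last
    select generate_series(
               date_trunc('week', min(signup_date)),
               date_trunc('week', max(signup_date)),
               interval '1 week') as test_week
    from facts
) w
left join facts f
       on f.record_type = rt.record_type
      and date_trunc('week', f.signup_date) = w.test_week
group by rt.record_type, w.test_week
order by w.test_week, rt.record_type;
```

With the sample data from the question, this should return the same six rows as the desired result, because every week in the range already has at least one row.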

Answer 1 (score: 1)

select
    coalesce(sum(data), 0) as "Sum",
    to_char(date_trunc('week', c.signup_date), 'YYYY-MM-DD') as test_week,
    c.record_type as user_type
from
    facts f
    right join
    (
        (
            select distinct record_type
            from facts
        ) f1
        cross join
        (
            select distinct signup_date
            from facts
        ) f2
    ) c on f.record_type = c.record_type and f.signup_date = c.signup_date
group by 2, 3
order by 2, 3
;
 Sum | test_week  | user_type 
-----+------------+-----------
   4 | 2013-09-02 | x
   5 | 2013-09-02 | y
   0 | 2013-09-09 | x
  10 | 2013-09-09 | y
   2 | 2013-09-16 | x
   1 | 2013-09-16 | y
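
The query above computes the week with date_trunc('week', ...), while the question and the other answers build it from EXTRACT(YEAR ...) and EXTRACT(WEEK ...) parsed with the 'IYYY IW' pattern. A small sketch for comparing the two on a couple of hand-picked dates follows; the column aliases and the test dates are illustrative and not from the thread. EXTRACT(WEEK ...) is the ISO week number but EXTRACT(YEAR ...) is the calendar year, so the two expressions can disagree near a year boundary. The third column uses EXTRACT(ISOYEAR ...), which pairs with 'IYYY'.

```sql
-- Sketch only: compare three ways of deriving the Monday that starts a date's week.
select d                                                   as signup_date,
       date_trunc('week', d)::date                         as week_trunc,
       to_timestamp(extract(year from d) || ' ' ||
                    extract(week from d), 'IYYY IW')::date as week_from_year,
       to_timestamp(extract(isoyear from d) || ' ' ||
                    extract(week from d), 'IYYY IW')::date as week_from_isoyear
from (values (date '2013-09-08'),   -- a Sunday inside the sample data range
             (date '2013-12-30'))   -- a Monday that belongs to ISO week 1 of 2014
     as t(d);
```

For the September date all three columns should agree. For 2013-12-30 the EXTRACT(YEAR ...) variant is expected to fall a year early, since it combines calendar year 2013 with ISO week 1, so it is worth checking if the real data spans a new year.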

Answer 2 (score: 0)

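After playing with the SQL myself for a while, I found another solution that also works. I am fairly sure this query does not perform as well as Clodoaldo Neto's or Gordon Linoff's, but I thought I would share another form of SQL that solves the problem: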

  select coalesce(data, 0), rt as record_type, weeks
    from 
      (select sum(data) as data, record_type, to_timestamp(EXTRACT(YEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date),'IYYY IW')::date as test_week
      from facts
      group by record_type, test_week
      order by record_type, test_week) as facts
    right join 
      (select distinct to_timestamp(EXTRACT(YEAR FROM signup_date) || ' ' || EXTRACT(WEEK FROM signup_date),'IYYY IW')::date as weeks, rts.rt as rt
       from facts
       cross join (select distinct record_type from facts) as rts (rt) 
       cross join (select distinct alt from facts) as alts (at)) as dates
    on dates.weeks = facts.test_week
    and dates.rt = facts.record_type