Case query with grouping on a generated column

Time: 2018-08-01 23:22:45

Tags: database postgresql data-warehouse snowflake-datawarehouse

Here is an example of some pseudo-SQL I am working on.

select count(*) as "count",
    time2.iso_timestamp - time1.iso_timestamp as "time_to_active",
    case
        when ("time_to_active" >= 1day and "time_to_active" <= 5days) then '1'
        when ("time_to_active" >= 6days and "time_to_active" <= 11days) then '2'
        when ("time_to_active" >= 12days and "time_to_active" <= 20days) then '3'
        when ("time_to_active" >= 21days and "time_to_active" <= 30days) then '4'
        when ("time_to_active" >= 31days) then '5'
    end as timetoactivegroup
from t
inner join t1 on t.p_id = t1.p_id
join timestamp time1 on t.timestamp_id = t1.id
join timestamp time2 on t1.timestamp_id = t2.id

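Essentially, I am trying to fit a calculated column into a group based on a range, something along the lines of "between n and y days". The main problem I have is generating a count based on that grouping.

I can run the select query to display the calculated values without any problem.

2 answers:

Answer 0: (score: 1)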

PostgreSQL does not allow you to group by the alias, so you need to repeat the grouping expression in the GROUP BY clause:

GROUP BY case
    when ("time_to_active" >= 1day and "time_to_active" <= 5days) then '1'
    when ("time_to_active" >= 6days and "time_to_active" <= 11days) then '2'
    when ("time_to_active" >= 12days and "time_to_active" <= 20days) then '3'
    when ("time_to_active" >= 21days and "time_to_active" <= 30days) then '4'
    when ("time_to_active" >= 31days) then '5'
end

Or you can group by column number, that is, by the position of the case expression in the select list.
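As a rough illustration of grouping by column number, here is a minimal PostgreSQL sketch. The table `activations` and its columns `started_at` and `activated_at` are hypothetical stand-ins for the joined result in the question, and the buckets are written as half-open ranges (`>= 1 day` and `< 6 days`) so that fractional-day differences do not fall between two buckets. It relies on PostgreSQL's GROUP BY accepting the ordinal number of an output column.

```sql
-- Sketch only: "activations", "started_at" and "activated_at" are
-- hypothetical names standing in for the joined tables in the question.
CREATE TEMP TABLE activations (
    started_at   timestamp,
    activated_at timestamp
);

INSERT INTO activations VALUES
    ('2018-11-10', '2018-11-11'),
    ('2018-11-08', '2018-11-11'),
    ('2018-10-08', '2018-11-11');

-- Subtracting two timestamps yields an interval, so each bucket is
-- compared against interval literals. The raw difference is left out of
-- the SELECT list, so the bucket is the only column that needs grouping.
SELECT
    CASE
        WHEN activated_at - started_at >= interval '1 day'
         AND activated_at - started_at <  interval '6 days'  THEN '1'
        WHEN activated_at - started_at >= interval '6 days'
         AND activated_at - started_at <  interval '12 days' THEN '2'
        WHEN activated_at - started_at >= interval '12 days'
         AND activated_at - started_at <  interval '21 days' THEN '3'
        WHEN activated_at - started_at >= interval '21 days'
         AND activated_at - started_at <  interval '31 days' THEN '4'
        WHEN activated_at - started_at >= interval '31 days' THEN '5'
    END AS time_to_active_group,
    count(*) AS "count"
FROM activations
GROUP BY 1   -- 1 = first item in the SELECT list (the CASE expression)
ORDER BY 1;
```

The ordinal has to be updated if the select list is reordered, which is the usual trade-off against repeating the full expression.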

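Answer 1: (score: 1)

Ignoring the pseudo-SQL (the time codes), and ignoring the table joins (where you refer to an unnamed table T2):

So if you have some rows with two timestamps, where timestamp_a is earlier than timestamp_b, the error I can see is probably caused by having the difference as a selected column, time2.iso_timestamp - time1.iso_timestamp as "time_to_active". You then have two columns that need to be grouped, but you do not actually want time_to_active in your answer, otherwise the case block that bunches your answers together makes little sense.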

If I have a table with a few rows (this just represents what your joined table would look like), then in Snowflake:

create or replace table t (timestamp_a timestamp_ntz, timestamp_b timestamp);
insert into t values ('2018-11-10','2018-11-11')
    ,('2018-11-08','2018-11-11')
    ,('2018-10-08','2018-11-11');
select datediff('day', timestamp_a, timestamp_b) as time_to_active from t;

gives 1, 3, 34. Wrapping that into a subselect (which could also be expressed as a CTE):

select case when (time_to_active >= 1 and time_to_active < 6) then '1'
          when (time_to_active >= 6 and time_to_active < 12) then '2'
          when (time_to_active >= 12 and time_to_active < 21) then '3'
          when (time_to_active >= 21 and time_to_active < 31) then '4'
          when (time_to_active >= 31) then '5'
    end as time_to_active_group
    ,count(*) as count 
from (
    select datediff('day', timestamp_a, timestamp_b) as time_to_active from t
) as A
group by time_to_active_group;

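The grouped query returns (time_to_active_group, count):

1, 2
5, 1

because we have 2 rows in the 1-5 bucket and 1 row in the >= 31 bucket.

Another pitfall is if you are not handling "same day" timestamps, or timestamps whose end is earlier than their start (that is, a negative time_to_active).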
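One possible way to cover those cases, written in the CTE form mentioned earlier, is to give zero and negative differences their own buckets. This is only a sketch for Snowflake: the 'negative' and 'same day' labels and the two extra sample rows are assumptions added for illustration, not part of the original answer.

```sql
-- Sketch only: the last two rows and the 'negative' / 'same day' labels
-- are hypothetical additions for illustration.
create or replace table t (timestamp_a timestamp_ntz, timestamp_b timestamp_ntz);
insert into t values ('2018-11-10','2018-11-11')
    ,('2018-11-08','2018-11-11')
    ,('2018-10-08','2018-11-11')
    ,('2018-11-11','2018-11-11')   -- same day
    ,('2018-11-12','2018-11-11');  -- end before start

with diffs as (
    -- whole-day difference between the two timestamps
    select datediff('day', timestamp_a, timestamp_b) as time_to_active
    from t
)
select case when time_to_active < 0   then 'negative'
            when time_to_active = 0   then 'same day'
            when time_to_active < 6   then '1'
            when time_to_active < 12  then '2'
            when time_to_active < 21  then '3'
            when time_to_active < 31  then '4'
            when time_to_active >= 31 then '5'
       end as time_to_active_group  -- NULL differences stay in a NULL group
      ,count(*) as count
from diffs
group by time_to_active_group;
```

Because the when branches are evaluated in order, each one only needs an upper bound. The final branch is spelled out as >= 31 rather than written as else, so that rows with a missing timestamp are not counted in bucket '5'.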