Returning the median of time deltas across different groups

Time: 2019-05-16 14:00:42

Tags: sql amazon-redshift

I am trying to compute the time spans between different steps in a table and return the median of each span, using this SQL code:

SELECT median(datediff(seconds,one,two)) as step_one,
       median(datediff(seconds,two,three)) as step_two
FROM Table

This returns the following error message:

  

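[0A000] [500310] Amazon Invalid operation: within group ORDER BY clauses for aggregate functions must be the same; java.lang.RuntimeException: com.amazon.support.exceptions.ErrorException: Amazon Invalid operation: within group ORDER BY clauses for aggregate functions must be the same;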

Note: returning a single median on its own works fine.

Here is a sample of my data frame:

one                                 two                        three    
2015-12-14 19:01:58.014247  2015-12-21 17:36:06.187302  2015-12-14 19:10:00.040057  2015-12-14 19:03:18.153519
2016-01-02 05:18:50.351975  2016-01-02 05:26:10.660299  2016-01-02 05:22:58.353365  2016-01-02 05:19:34.915794
2016-02-08 07:29:23.938046  2016-02-08 07:41:42.016819  2016-02-08 07:31:23.899776  2016-02-08 07:30:03.168844
2016-02-25 18:25:39.223014  2016-02-25 18:31:07.087808  2016-02-25 18:29:02.490969  2016-02-25 18:26:20.188472
2015-11-26 12:02:27.033141  2015-11-26 12:07:52.813699  2015-11-26 12:06:33.106484  2015-11-26 12:03:09.152853

2015-12-18 08:44:13.184319  2015-12-18 13:10:51.707354  2015-12-18 13:09:35.938711  2015-12-18 13:02:22.650966
2016-01-31 06:41:55.165849  2016-01-31 06:44:58.004319  2016-01-31 06:43:25.923505  2016-01-31 06:42:29.955232
2016-02-15 12:22:29.051259  2016-02-22 09:29:15.649721  2016-02-22 08:40:45.221558  2016-02-16 06:52:52.368139

The desired result is the median time delta from one to two and from two to three (the actual data has more columns).

2 answers:

Answer 0: (score: 0)

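If a statement includes multiple calls to sort-based aggregate functions (LISTAGG, PERCENTILE_CONT, or MEDIAN), they must all use the same ORDER BY values. Note that MEDIAN applies an implicit ORDER BY on the expression value.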

From https://docs.aws.amazon.com/redshift/latest/dg/r_PERCENTILE_CONT.html
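To make the implicit ordering concrete, here is a minimal sketch that writes a single median both ways. It assumes a hypothetical table name `step_times` (the question's table is simply called `Table`) and relies on the documented fact that MEDIAN is a special case of PERCENTILE_CONT(0.5).

```sql
-- Sketch only: step_times is a hypothetical stand-in for the question's table.
-- Each statement contains one sort-based aggregate, so each runs on its own.

-- Short form: the ordering on the expression is implicit.
SELECT median(datediff(seconds, one, two)) AS step_one
FROM step_times;

-- Long form: the same ordering written out explicitly.
SELECT percentile_cont(0.5)
         WITHIN GROUP (ORDER BY datediff(seconds, one, two)) AS step_one
FROM step_times;
```

Putting a second MEDIAN over a different expression, such as `datediff(seconds, two, three)`, into the same SELECT introduces a second, different ORDER BY expression, which is what the error message is complaining about.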

Answer 1: (score: 0)

For this query, since there is no GROUP BY, you can split it into two subqueries, each with a single MEDIAN call:

Select step_one,
       step_two
From
      (SELECT median(datediff(seconds,one,two)) as step_one
       FROM Table) as a,
      (SELECT median(datediff(seconds,two,three)) as step_two
       FROM Table) as b
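Since the real data has more columns, another option may be to stack the deltas into one column and group by a step label, so that the statement contains only one MEDIAN call. This is a sketch under assumptions: `step_times`, `step`, and `delta_seconds` are hypothetical names, and the result comes back as one row per step rather than one column per step.

```sql
-- Sketch only: step_times, step, and delta_seconds are hypothetical names.
-- Stack every step's delta into a single column, then take one MEDIAN per label.
SELECT step,
       median(delta_seconds) AS median_seconds
FROM (
      SELECT 'step_one' AS step, datediff(seconds, one, two)   AS delta_seconds
      FROM step_times
      UNION ALL
      SELECT 'step_two' AS step, datediff(seconds, two, three) AS delta_seconds
      FROM step_times
     ) AS deltas
GROUP BY step;
```

Adding a further step then only requires one more UNION ALL branch instead of one more cross-joined subquery.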

For the more complex case where the SELECT includes a GROUP BY, I found a workaround. Consider the following table:


create table test321 (i int, j int, k int, l int);



insert into test321 values(null, null, null, null);
insert into test321 values(null, 13, null, null);
insert into test321 values(17, null, null, null);
insert into test321 values(null, 15, null, 14);
insert into test321 values(15, null, null, 15);


insert into test321 values(null, 14, 10, null);
insert into test321 values(14, null, 11, null);
insert into test321 values(null, 16, 12, 12);
insert into test321 values(16, null, 13, 13);

insert into test321 values(1, 1, 1, 1);
insert into test321 values(2, 2, 1, 2);
insert into test321 values(3, 3, 1, 3);
insert into test321 values(4, 4, 2, 1);
insert into test321 values(5, 5, 2, 2);
insert into test321 values(6, 6, 2, 3);
insert into test321 values(7, 7, 3, 1);
insert into test321 values(8, 8, 3, 2);
insert into test321 values(9, 9, 3, 3);
insert into test321 values(10, 10, 4, 1);
insert into test321 values(11, 11, 4, 2);
insert into test321 values(12, 12, 4, 3);

Suppose we are looking for:

select  k, l, median(i), median(j)
from    test321
group by  k, l

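Then a general solution is to compute each median in its own grouped subquery and join the results on the group keys. Because NULL keys never compare equal in a join condition, each NULL key is replaced with a placeholder value and flagged with a status column (-1), so the rows can be matched and the key restored to NULL afterwards: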


Select  case when a1.kstatus = -1 then null else a1.k end k,
        case when a1.lstatus = -1 then null else a1.l end l,
        medi,
        medj
From    ( Select  coalesce(k, (select max(k) k from test321 where k is not null)) k,
                  case when a.k is not null then 0 else -1 end kstatus,
                  coalesce(l, (select max(l) l from test321 where l is not null)) l,
                  case when a.l is not null then 0 else -1 end lstatus,
                  median(i) medi
          From    (
                    select  i, j, k, l
                    from    test321
                  ) as a
          group by k, l
        ) as a1
          inner join
        ( Select  coalesce(k, (select max(k) k from test321 where k is not null)) k,
                  case when a.k is not null then 0 else -1 end kstatus,
                  coalesce(l, (select max(l) l from test321 where l is not null)) l,
                  case when a.l is not null then 0 else -1 end lstatus,
                  median(j) medj
          From    (
                    select  i, j, k, l
                    from    test321
                  ) as a
          group by k, l
        ) as a2
          on  (
                a1.k =  a2.k            and
                a1.l =  a2.l            and
                a1.kstatus = a2.kstatus and
                a1.lstatus = a2.lstatus
              )
;
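As a possible alternative to the join, the same result can be sketched by stacking `i` and `j` into one value column, taking a single MEDIAN per group and source column, and pivoting back. The names `src`, `val`, `med`, `long_form`, and `per_column` are hypothetical; this variant leans on GROUP BY placing NULL keys in one group, so no placeholder values are needed.

```sql
-- Sketch only: src, val, med, long_form and per_column are hypothetical names.
SELECT k,
       l,
       max(CASE WHEN src = 'i' THEN med END) AS medi,  -- median of i for this (k, l)
       max(CASE WHEN src = 'j' THEN med END) AS medj   -- median of j for this (k, l)
FROM (
      -- one MEDIAN call, grouped by the keys plus the source column label
      SELECT k, l, src, median(val) AS med
      FROM (
            SELECT k, l, 'i' AS src, i AS val FROM test321
            UNION ALL
            SELECT k, l, 'j' AS src, j AS val FROM test321
           ) AS long_form
      GROUP BY k, l, src
     ) AS per_column
GROUP BY k, l;
```

Only one sort-based aggregate appears in any single SELECT here, so the identical ORDER BY requirement does not come into play.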

Hope this helps.