I am trying to calculate the time ranges between different steps in a table and return the median of each calculation with this SQL:
SELECT median(datediff(seconds, one, two)) as step_one,
       median(datediff(seconds, two, three)) as step_two
FROM Table
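This returns the following error message:
[0A000][500310] Amazon Invalid operation: within group ORDER BY clauses for aggregate functions must be the same; java.lang.RuntimeException: com.amazon.support.exceptions.ErrorException: Amazon Invalid operation: within group ORDER BY clauses for aggregate functions must be the same;
Note: I can return a single median on its own without any problem.
Here is a sample of my data frame: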
one two three
2015-12-14 19:01:58.014247 2015-12-21 17:36:06.187302 2015-12-14 19:10:00.040057 2015-12-14 19:03:18.153519
2016-01-02 05:18:50.351975 2016-01-02 05:26:10.660299 2016-01-02 05:22:58.353365 2016-01-02 05:19:34.915794
2016-02-08 07:29:23.938046 2016-02-08 07:41:42.016819 2016-02-08 07:31:23.899776 2016-02-08 07:30:03.168844
2016-02-25 18:25:39.223014 2016-02-25 18:31:07.087808 2016-02-25 18:29:02.490969 2016-02-25 18:26:20.188472
2015-11-26 12:02:27.033141 2015-11-26 12:07:52.813699 2015-11-26 12:06:33.106484 2015-11-26 12:03:09.152853
2015-12-18 08:44:13.184319 2015-12-18 13:10:51.707354 2015-12-18 13:09:35.938711 2015-12-18 13:02:22.650966
2016-01-31 06:41:55.165849 2016-01-31 06:44:58.004319 2016-01-31 06:43:25.923505 2016-01-31 06:42:29.955232
2016-02-15 12:22:29.051259 2016-02-22 09:29:15.649721 2016-02-22 08:40:45.221558 2016-02-16 06:52:52.368139
The desired result is the median time delta from one to two and from two to three (the real data has more columns).
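Answer 0 (score: 0)
If a statement includes multiple calls to sort-based aggregate functions (LISTAGG, PERCENTILE_CONT, or MEDIAN), they must all use the same ORDER BY values. Note that MEDIAN applies an implicit ORDER BY on the expression value.
From https://docs.aws.amazon.com/redshift/latest/dg/r_PERCENTILE_CONT.html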
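To make the quoted rule concrete, here is a minimal sketch. It assumes a hypothetical table name `step_times` standing in for the asker's `Table`, with the timestamp columns `one`, `two` and `three`, and it has not been run against a cluster. The first statement orders both aggregates by the same expression, so it should be accepted. The second orders them by two different expressions, which is the situation that triggers the error in the question.

```sql
-- Assumption: step_times is a hypothetical stand-in for the asker's table.

-- Same ORDER BY value for both sort-based aggregates: expected to be accepted.
select median(datediff(seconds, one, two)) as step_one_median,
       percentile_cont(0.9) within group (order by datediff(seconds, one, two)) as step_one_p90
from step_times;

-- Different ORDER BY values (MEDIAN orders implicitly by its argument):
-- expected to fail with "within group ORDER BY clauses for aggregate functions must be the same".
select median(datediff(seconds, one, two))   as step_one,
       median(datediff(seconds, two, three)) as step_two
from step_times;
```

In other words, the limit is not on the number of sort-based aggregates per statement but on how many distinct orderings they use.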
Answer 1 (score: 0)
For this query, since there is no GROUP BY, you can split the query into two parts:
-- Each median lives in its own single-row subquery; the comma join pairs the two rows.
select step_one,
       step_two
from
    (select median(datediff(seconds, one, two)) as step_one
     from Table) as a,
    (select median(datediff(seconds, two, three)) as step_two
     from Table) as b
In more complex cases, where the select includes a GROUP BY, I found a workaround for this problem. Consider the following table:
create table test321 (i int, j int, k int, l int);
insert into test321 values(null, null, null, null);
insert into test321 values(null, 13, null, null);
insert into test321 values(17, null, null, null);
insert into test321 values(null, 15, null, 14);
insert into test321 values(15, null, null, 15);
insert into test321 values(null, 14, 10, null);
insert into test321 values(14, null, 11, null);
insert into test321 values(null, 16, 12, 12);
insert into test321 values(16, null, 13, 13);
insert into test321 values(1, 1, 1, 1);
insert into test321 values(2, 2, 1, 2);
insert into test321 values(3, 3, 1, 3);
insert into test321 values(4, 4, 2, 1);
insert into test321 values(5, 5, 2, 2);
insert into test321 values(6, 6, 2, 3);
insert into test321 values(7, 7, 3, 1);
insert into test321 values(8, 8, 3, 2);
insert into test321 values(9, 9, 3, 3);
insert into test321 values(10, 10, 4, 1);
insert into test321 values(11, 11, 4, 2);
insert into test321 values(12, 12, 4, 3);
Suppose we are looking for:
select k, l, median(i), median(j)
from test321
group by k, l
Then a general solution is:
-- Each median is computed in its own grouped subquery; the two results are then
-- joined on the grouping keys. NULL keys are swapped for a placeholder (the max value)
-- and flagged in a status column so that the equality join still matches them.
select case when a1.kstatus = -1 then null else a1.k end as k,
       case when a1.lstatus = -1 then null else a1.l end as l,
       medi,
       medj
from (
    select coalesce(a.k, (select max(k) from test321 where k is not null)) as k,
           case when a.k is not null then 0 else -1 end as kstatus,
           coalesce(a.l, (select max(l) from test321 where l is not null)) as l,
           case when a.l is not null then 0 else -1 end as lstatus,
           median(a.i) as medi
    from (
        select i, j, k, l
        from test321
    ) as a
    group by a.k, a.l
) as a1
inner join (
    select coalesce(a.k, (select max(k) from test321 where k is not null)) as k,
           case when a.k is not null then 0 else -1 end as kstatus,
           coalesce(a.l, (select max(l) from test321 where l is not null)) as l,
           case when a.l is not null then 0 else -1 end as lstatus,
           median(a.j) as medj
    from (
        select i, j, k, l
        from test321
    ) as a
    group by a.k, a.l
) as a2
on (
    a1.k = a2.k and
    a1.l = a2.l and
    a1.kstatus = a2.kstatus and
    a1.lstatus = a2.lstatus
)
;
Hope this helps.
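For comparison, the same idea (one median per grouped subquery, then a join on the grouping keys) can be sketched with an explicit null-safe join condition instead of the placeholder and status columns. This is a different technique from the one in the answer, not a restatement of it. The CTE names `med_i` and `med_j` are made up for illustration, and the query is untested on a cluster. The OR-based join condition may also be planned less efficiently than a plain equality join.

```sql
-- Sketch only: med_i and med_j are hypothetical CTE names.
with med_i as (
    select k, l, median(i) as medi   -- only one sort-based aggregate per subquery
    from test321
    group by k, l
),
med_j as (
    select k, l, median(j) as medj
    from test321
    group by k, l
)
select med_i.k,
       med_i.l,
       med_i.medi,
       med_j.medj
from med_i
join med_j
  -- Treat NULL keys as equal to each other so NULL groups are not dropped.
  on (med_i.k = med_j.k or (med_i.k is null and med_j.k is null))
 and (med_i.l = med_j.l or (med_i.l is null and med_j.l is null));
```

Both subqueries group the same table by the same keys, so each (k, l) group, including those with NULL keys, should appear exactly once on each side of the join.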