First_value window function in Greenplum

Time: 2017-06-28 12:22:32

Tags: greenplum

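I am working with Greenplum DB.

I am getting strange results from the First_value window function. When the order by column holds a string value that is the same in every row, it always returns the row that was inserted first, whereas ideally it should be able to return any value. Below is my code: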

create temporary table test_first_value (
    id          int,
    statename   varchar(50),
    episodeid   int,
    episodedate date
) distributed by (id);

insert into test_first_value values(12,'MP',9863,'2015-11-06');
insert into test_first_value values(12,'MP',98123,'2009-11-06');
insert into test_first_value values(12,'MP',90123,'2017-03-06');
insert into test_first_value values(12,'MP',44567,'2013-03-17');
insert into test_first_value values(13,'MP',189300,'2013-03-17');
insert into test_first_value values(13,'MP',443467,'2016-03-19');

It always returns the values of the row that was inserted first: episodeid = 9863 for id = 12, and episodeid = 189300 for id = 13.

select *,
       first_value(episodeid)   over (partition by id order by statename) as first_episodeid,
       first_value(episodedate) over (partition by id order by statename) as first_episodedate
from test_first_value;


Now, if I change the insert order, it again returns the values of whichever row was inserted first, i.e. episodeid = 98123 for id = 12 and episodeid = 443467 for id = 13.

delete from test_first_value;

insert into test_first_value values(12,'MP',98123,'2009-11-06');
insert into test_first_value values(12,'MP',90123,'2017-03-06');
insert into test_first_value values(12,'MP',44567,'2013-03-17');
insert into test_first_value values(12,'MP',9863,'2015-11-06');
insert into test_first_value values(13,'MP',443467,'2016-03-19');
insert into test_first_value values(13,'MP',189300,'2013-03-17');

select *,
       first_value(episodeid)   over (partition by id order by statename) as first_episodeid,
       first_value(episodedate) over (partition by id order by statename) as first_episodedate
from test_first_value;


Please help me understand what I am doing wrong.

1 Answer:

Answer 0: (score: 0)

Your code is working correctly. This is your window function:

First_value(episodeid) over(partition by id order by statename)

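As your own data shows, each id has multiple rows with the same statename. In that case, the database returns an arbitrary and indeterminate value from among the rows with matching keys.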
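A quick way to confirm that ties are the cause is to count the rows that share the same partition key and ordering key. This is only a sketch against the test_first_value table from the question; the tied_rows alias is made up for illustration.

```sql
-- Count rows that share the same (partition key, ordering key) pair.
-- Any group with more than one row is a tie that first_value
-- cannot resolve deterministically.
select id,
       statename,
       count(*) as tied_rows   -- illustrative alias, not from the question
from test_first_value
group by id, statename
having count(*) > 1;
```

With the sample data, both ids should appear: id = 12 with 4 tied rows and id = 13 with 2 tied rows.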

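Another way of saying this is that sorting in a relational database is not stable. The reason is simple: tables represent unordered sets. When the sort keys are exactly equal, there is no natural ordering to fall back on.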

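So, find another key so that the order by uniquely identifies each row. The result will then be stable, because the desired row is uniquely identified. In your data, you could add episodedate as a second key in the order by.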
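As a minimal sketch of that suggestion, the query from the question could be rewritten with the date column as a tie-breaker. It assumes that the earliest episode is the one wanted and that episodedate is unique within each id, which holds for the sample rows but may not hold for real data.

```sql
-- Same query as in the question, with episodedate added as a second
-- ordering key so that each row in a partition has a distinct sort position.
-- Assumption: the earliest episode per id is the desired "first" row.
select *,
       first_value(episodeid)   over (partition by id order by statename, episodedate) as first_episodeid,
       first_value(episodedate) over (partition by id order by statename, episodedate) as first_episodedate
from test_first_value;
```

For the sample data this should give episodeid = 98123 (2009-11-06) for id = 12 and episodeid = 189300 (2013-03-17) for id = 13, regardless of insert order. If two episodes for the same id could share a date, a further key such as episodeid would be needed to keep the ordering unique.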