PostgreSQL row_number() does not order by date

Time: 2014-01-28 16:53:34

Tags: postgresql window-functions row-number

I'm using Postgres 9.3.2 and I have this table:

id  startdate   enddate    no_of_days_between

1   2010-12-22  2010-12-23  1
1   2010-12-23  2010-12-24  1
1   2010-12-24  2010-12-25  1
1   2010-12-25  2010-12-26  1
1   2010-12-26  2010-12-27  1
1   2010-12-27  2010-12-28  1
1   2010-12-28  2010-12-29  1
1   2010-12-29  2011-03-06  67
1   2011-03-06  2011-03-07  1
1   2011-03-07  2011-03-08  1
1   2011-03-08  2011-03-09  1

What I want to do is find streaks of consecutive days. To do that, I'm using the row_number window function in this query:

select t.*, row_number() over (partition by no_of_days_between order by enddate) as no_of_consecutive_days from t

What I want is this:

id  startdate   enddate     no_of_days_between  no_of_consecutive_days
1   2010-12-22  2010-12-23  1                   1
1   2010-12-23  2010-12-24  1                   2
1   2010-12-24  2010-12-25  1                   3
1   2010-12-25  2010-12-26  1                   4
1   2010-12-26  2010-12-27  1                   5
1   2010-12-27  2010-12-28  1                   6
1   2010-12-28  2010-12-29  1                   7
1   2010-12-29  2011-03-06  67                  1
1   2011-03-06  2011-03-07  1                   1
1   2011-03-07  2011-03-08  1                   2
1   2011-03-08  2011-03-09  1                   3

However, what the query returns looks as if it had been sorted by no_of_days_between first and then by enddate, so I get back:

id  startdate   enddate     no_of_days_between  no_of_consecutive_days
1   2010-12-22  2010-12-23  1                   1
1   2010-12-23  2010-12-24  1                   2
1   2010-12-24  2010-12-25  1                   3
1   2010-12-25  2010-12-26  1                   4
1   2010-12-26  2010-12-27  1                   5
1   2010-12-27  2010-12-28  1                   6
1   2010-12-28  2010-12-29  1                   7
1   2011-03-06  2011-03-07  1                   8
1   2011-03-07  2011-03-08  1                   9
1   2011-03-08  2011-03-09  1                   10
1   2010-12-29  2011-03-06  67                  1

Has anyone run into this before? How can I force it to order first and then partition?

Thanks

2 answers:

Answer 0: (score: 2)

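You still need an "ORDER BY enddate" at the end of the query; otherwise the rows come back in whatever order Postgres happens to produce them.

The ORDER BY inside the OVER clause only controls how row_number() sees the data, not the order in which the data is finally returned.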
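As a minimal sketch of that distinction, assuming the table is named t as in the question, the query below keeps the original window and adds an outer ORDER BY:

```sql
-- Assumes the question's table t(id, startdate, enddate, no_of_days_between).
SELECT t.*,
       row_number() OVER (PARTITION BY no_of_days_between
                          ORDER BY enddate) AS no_of_consecutive_days  -- numbering order
FROM t
ORDER BY enddate;  -- order of the returned rows only
```

This returns the rows in date order, but the numbering itself does not change: every row with no_of_days_between = 1 still belongs to the same partition, so the rows after the 67-day gap would still be numbered 8, 9 and 10 rather than restarting at 1.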

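Answer 1: (score: 0)

According to the SQL standard, this appears to be the expected behaviour. I had to write a function to do what I wanted, which is to order by date first and then partition. This means the counter resets every time a different number appears.

This works well for streaks of consecutive days.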
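The same reset-on-change numbering can probably also be sketched in plain SQL, using the common "gaps and islands" pattern of subtracting two row_number() values. This is an alternative sketch and not the approach taken in this answer; it assumes the question's table t, and grp is a hypothetical helper column:

```sql
-- Rows in the same unbroken run share the same grp value, because both
-- row numbers advance together until no_of_days_between changes.
WITH marked AS (
    SELECT t.*,
           row_number() OVER (PARTITION BY id ORDER BY enddate)
         - row_number() OVER (PARTITION BY id, no_of_days_between
                              ORDER BY enddate) AS grp
    FROM t
)
SELECT id, startdate, enddate, no_of_days_between,
       row_number() OVER (PARTITION BY id, no_of_days_between, grp
                          ORDER BY enddate) AS no_of_consecutive_days
FROM marked
ORDER BY id, enddate;
```

On the sample data, the first seven rows get grp = 0, the 67-day row gets grp = 7, and the last three rows get grp = 1, so each run is numbered from 1 again.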

My function code:

CREATE TYPE consecutive_length_type AS
   (daystreak integer,
    streakstart date,
    streakend date);

CREATE OR REPLACE FUNCTION get_maxconsecutive_day_streak() RETURNS consecutive_length_type AS
$BODY$
declare 
    max_length integer := 0;
    end_date date;
    cons_days integer :=0;
    return_rec consecutive_length_type;
    rec record;
begin

for rec in 
    select * from t order by enddate --table as above, read in date order
loop
    if rec.no_of_days_between = 1 then 
        cons_days := cons_days + 1 ;
        if cons_days > max_length then
            max_length := cons_days;
            end_date := rec.enddate ; 
                          --this way I can see when the streak ended
        end if;
    else 
        cons_days := 0;
    end if;
end loop;


return_rec.daystreak := max_length;
return_rec.streakend := end_date;
return_rec.streakstart := end_date - max_length; 
                       --I am calculating the day start so I can use it further on

return return_rec;
end;
$BODY$
  LANGUAGE plpgsql ;
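
A short usage sketch, assuming the type and function above were created successfully and the table from the question is populated:

```sql
-- A function returning a composite type can be called in FROM,
-- which expands its fields into columns.
SELECT daystreak, streakstart, streakend
FROM get_maxconsecutive_day_streak();
```

If the rows are scanned in date order, the sample data should report a streak of 7 days ending on 2010-12-29, with streakstart computed as 2010-12-29 minus 7 days, i.e. 2010-12-22.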

What I originally had was a table of ids mapped to dates. To calculate the number of days between two consecutive dates, I run this query:

select id,
       lag(date_logged, 1, date_logged) over
           (partition by id order by id, date_logged) as startdate,
       date_logged as enddate,
       date_logged - lag(date_logged, 1, date_logged) over
           (partition by id order by id, date_logged) as no_of_days_between
from table_name;
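
To try the lag() step in isolation, here is a self-contained sketch. The inline rows are illustrative stand-ins for a hypothetical table_name(id, date_logged), and the named window w is only a shorthand for the repeated OVER clause:

```sql
-- The CTE replaces the real table so the query can run on its own.
WITH table_name(id, date_logged) AS (
    VALUES (1, DATE '2010-12-22'),
           (1, DATE '2010-12-23'),
           (1, DATE '2011-03-06')
)
SELECT id,
       lag(date_logged, 1, date_logged) OVER w AS startdate,
       date_logged AS enddate,
       date_logged - lag(date_logged, 1, date_logged) OVER w AS no_of_days_between
FROM table_name
WINDOW w AS (PARTITION BY id ORDER BY date_logged);
```

Because the third argument of lag() is the row's own date, the first row for each id has startdate equal to enddate and a difference of 0. Subtracting two date values in PostgreSQL yields an integer number of days.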