Maximum timestamp for each day

Asked: 2012-06-12 17:22:25

Tags: performance postgresql postgresql-9.1

I have an event table:

create table event (id integer primary key, date timestamptz unique);
insert into event (id, date) values
 (22784, '2012-01-01 01:00:00+00'),
 (22785, '2012-01-01 04:00:00+00'),
 (22786, '2012-01-01 07:00:00+00'),
 (22787, '2012-01-01 10:00:00+00'),
 (22788, '2012-01-01 13:00:00+00'),
 (22789, '2012-01-01 16:00:00+00'),
 (22790, '2012-01-01 19:00:00+00'),
 (22791, '2012-01-01 22:00:00+00'),
 (22792, '2012-01-02 01:00:00+00'),
 (22793, '2012-01-02 04:00:00+00'),
 (22794, '2012-01-02 07:00:00+00'),
 (22795, '2012-01-02 10:00:00+00'),
 (22796, '2012-01-02 13:00:00+00'),
 (22797, '2012-01-02 16:00:00+00'),
 (22798, '2012-01-02 19:00:00+00'),
 (22799, '2012-01-02 22:00:00+00'),
 (22800, '2012-01-03 01:00:00+00'),
 (22801, '2012-01-03 04:00:00+00'),
 (22802, '2012-01-03 07:00:00+00'),
 (22803, '2012-01-03 10:00:00+00'),
 (22804, '2012-01-03 13:00:00+00'),
 (22805, '2012-01-03 16:00:00+00'),
 (22806, '2012-01-03 19:00:00+00'),
 (22807, '2012-01-03 22:00:00+00'),
 (22808, '2012-01-04 01:00:00+00'),
 (22809, '2012-01-04 04:00:00+00'),
 (22810, '2012-01-04 07:00:00+00'),
 (22811, '2012-01-04 10:00:00+00'),
 (22812, '2012-01-04 13:00:00+00'),
 (22813, '2012-01-04 16:00:00+00'),
 (22814, '2012-01-04 19:00:00+00'),
 (22815, '2012-01-04 22:00:00+00'),
 (22816, '2012-01-05 01:00:00+00'),
 (22817, '2012-01-05 04:00:00+00'),
 (22818, '2012-01-05 07:00:00+00'),
 (22819, '2012-01-05 10:00:00+00'),
 (22820, '2012-01-05 13:00:00+00'),
 (22821, '2012-01-05 16:00:00+00'),
 (22822, '2012-01-05 19:00:00+00'),
 (22823, '2012-01-05 22:00:00+00'),
 (22824, '2012-01-06 01:00:00+00'),
 (22825, '2012-01-06 04:00:00+00'),
 (22826, '2012-01-06 07:00:00+00'),
 (22827, '2012-01-06 10:00:00+00'),
 (22828, '2012-01-06 13:00:00+00'),
 (22829, '2012-01-06 16:00:00+00'),
 (22830, '2012-01-06 19:00:00+00'),
 (22832, '2012-01-06 22:00:00+00'),
 (22833, '2012-01-07 01:00:00+00'),
 (22834, '2012-01-07 04:00:00+00'),
 (22835, '2012-01-07 07:00:00+00'),
 (22836, '2012-01-07 10:00:00+00'),
 (22837, '2012-01-07 13:00:00+00'),
 (22838, '2012-01-07 16:00:00+00'),
 (22839, '2012-01-07 19:00:00+00'),
 (22840, '2012-01-07 22:00:00+00'),
 (22841, '2012-01-08 01:00:00+00'),
 (22842, '2012-01-08 04:00:00+00'),
 (22843, '2012-01-08 07:00:00+00'),
 (22844, '2012-01-08 10:00:00+00'),
 (22845, '2012-01-08 13:00:00+00'),
 (22846, '2012-01-08 16:00:00+00'),
 (22847, '2012-01-08 19:00:00+00'),
 (22848, '2012-01-08 22:00:00+00'),
 (22849, '2012-01-09 01:00:00+00'),
 (22850, '2012-01-09 04:00:00+00'),
 (22851, '2012-01-09 07:00:00+00'),
 (22852, '2012-01-09 10:00:00+00'),
 (22853, '2012-01-09 13:00:00+00'),
 (22854, '2012-01-09 16:00:00+00'),
 (22855, '2012-01-09 19:00:00+00'),
 (22856, '2012-01-09 22:00:00+00'),
 (22857, '2012-01-10 01:00:00+00'),
 (22858, '2012-01-10 04:00:00+00'),
 (22859, '2012-01-10 07:00:00+00'),
 (22860, '2012-01-10 10:00:00+00'),
 (22861, '2012-01-10 13:00:00+00'),
 (22862, '2012-01-10 16:00:00+00'),
 (22863, '2012-01-10 19:00:00+00'),
 (22864, '2012-01-10 22:00:00+00')
;

For each day, I want the ID of the row with the maximum timestamp. This is what I have:

select date::date as date, id
from event e
where date = (
    select max(date)
    from event
    where date_trunc('day', date) = date_trunc('day', e.date)
)
order by date;
    date    |  id   
------------+-------
 2011-12-31 | 22784
 2012-01-01 | 22792
 2012-01-02 | 22800
 2012-01-03 | 22808
 2012-01-04 | 22816
 2012-01-05 | 22824
 2012-01-06 | 22833
 2012-01-07 | 22841
 2012-01-08 | 22849
 2012-01-09 | 22857
 2012-01-10 | 22864
(11 rows)

It works as expected, but performance is very bad. Any suggestions?

3 answers:

Answer 0: (score: 2)

DISTINCT ON may perform better:

SELECT DISTINCT ON (d) date_trunc('day',date)::date as d ,id
FROM event 
ORDER BY d desc, date desc;

If the result must be sorted by ascending date, another level of sorting is needed:

SELECT d,id 
FROM (
    SELECT DISTINCT ON (d) date_trunc('day',date)::date as d ,id
    FROM event 
    ORDER BY d desc, date desc
) s
ORDER BY d;

Answer 1: (score: 1)

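-- Note: as written, this returns only the rows on the most recent day in the table.
Select e.date, e.id from event e
inner join (
    Select Max(date_trunc('day', date)) as LatestDay
    From event
) m On m.LatestDay = date_trunc('day', e.date);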

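This might be a bit better, but it is the date_trunc function that is killing performance here. An expression index on it may be worth trying: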

CREATE INDEX ON event ((date_trunc('day',date)));
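
A caveat on the index above: because date is a timestamptz column, date_trunc('day', date) depends on the session's TimeZone setting and is only STABLE, so PostgreSQL should reject the definition with "functions in index expression must be marked IMMUTABLE". A minimal sketch of one workaround, assuming days are meant as UTC days (that assumption and the index name event_utc_day_idx are mine, not the answer's):

```sql
-- Assumption: a "day" means a UTC day. AT TIME ZONE 'UTC' turns the
-- timestamptz into a plain timestamp, and date_trunc on timestamp is IMMUTABLE.
-- "date" is quoted only to keep the column name visually distinct from the type.
CREATE INDEX event_utc_day_idx
    ON event ((date_trunc('day', "date" AT TIME ZONE 'UTC')));

-- A query must repeat the indexed expression for the planner to consider the index.
SELECT id, "date"
FROM   event
WHERE  date_trunc('day', "date" AT TIME ZONE 'UTC') = timestamp '2012-01-05';
```

With UTC days the grouping differs from the output shown in the question, which follows the session's local days.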

Answer 2: (score: 1)

This should be much faster:

SELECT DISTINCT ON (1)
       date::date, id
FROM   event e
ORDER  BY 1, e.date DESC;  -- e.date, not just date!
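  • No additional level of sorting is needed for ascending order.

  • date_trunc() is not necessary. A cast to date does the job.

  • In the ORDER BY clause, be sure to table-qualify e.date so that it picks the source column rather than the output column of the same name. The visibility rules are confusing here, due to the messy SQL standard.

  • BTW, date is a reserved word in every SQL standard. You are not going to use it as a column name in the real world, are you? This small example is confusing enough already.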
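
One detail the points above leave implicit: both the cast date::date and date_trunc('day', date) on a timestamptz use the session's TimeZone setting. That would explain why the output in the question starts with 2011-12-31 even though every inserted value is dated 2012 in UTC. A small sketch, assuming a session zone a few hours behind UTC (America/Sao_Paulo is my example; the question does not say which zone was in effect):

```sql
-- Assumption: America/Sao_Paulo stands in for the asker's unknown session zone.
SET timezone = 'America/Sao_Paulo';
SELECT '2012-01-01 01:00:00+00'::timestamptz::date;  -- 2011-12-31

-- The same instant falls on the next calendar day when the session uses UTC.
SET timezone = 'UTC';
SELECT '2012-01-01 01:00:00+00'::timestamptz::date;  -- 2012-01-01
```

If the day boundaries must not depend on the session, (date AT TIME ZONE 'UTC')::date pins them to a fixed zone.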


Another way would be to use a window function:

SELECT DISTINCT ON (1)
       date::date
      ,first_value(id) OVER (PARTITION BY date::date ORDER BY date DESC) AS id
FROM   event
ORDER  BY 1;

More readable, but considerably slower.
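
The speed claims in this thread can be checked on your own data with EXPLAIN ANALYZE. The 80 sample rows are too few to show a difference, so the sketch below first adds synthetic rows; the row count, the ID offset and the 2013 start date are arbitrary choices of mine, picked only to avoid clashing with the sample data:

```sql
-- Hypothetical test data: 5000 extra events, three hours apart, starting in 2013.
INSERT INTO event (id, date)
SELECT 100000 + g,
       timestamptz '2013-01-01 00:00:00+00' + g * interval '3 hours'
FROM   generate_series(1, 5000) AS g;

-- Original query from the question (correlated subquery, evaluated per row).
EXPLAIN ANALYZE
SELECT date::date AS date, id
FROM   event e
WHERE  date = (
    SELECT max(date)
    FROM   event
    WHERE  date_trunc('day', date) = date_trunc('day', e.date)
)
ORDER  BY date;

-- DISTINCT ON variant from this answer.
EXPLAIN ANALYZE
SELECT DISTINCT ON (1)
       date::date, id
FROM   event e
ORDER  BY 1, e.date DESC;
```

Compare the reported total runtime of the two plans; the numbers depend on hardware, settings and data volume, so none are quoted here.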