Question

I have the following query:

WITH t as (
  SELECT date_trunc('hour', time_series) as trunc 
  FROM generate_series('2013-02-27 22:00'::timestamp, '2013-02-28 2:00', 
                       '1 hour') as time_series
  GROUP BY trunc
  ORDER BY trunc
)
SELECT DISTINCT ON(trunc) trunc, id
FROM t
LEFT JOIN (
   SELECT id, created, date_trunc('hour', created) as trunc_u
   FROM event
   ORDER BY created DESC
) u
ON trunc = trunc_u

It produces the following result:

"2013-02-27 22:00:00";
"2013-02-27 23:00:00";2
"2013-02-28 00:00:00";5
"2013-02-28 01:00:00";
"2013-02-28 02:00:00";

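The event table contains id, created, and a few other columns, but only those two are relevant here. The query above gives me the id of the last event generated in each trunc time period (thanks to DISTINCT ON, I get a nice aggregation per period).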

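Now, if no event occurred in a given period, this query yields NULL. I would like it to return the previous available id instead, even if it comes from a different period. That is: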

"2013-02-27 22:00:00";0
"2013-02-27 23:00:00";2
"2013-02-28 00:00:00";5
"2013-02-28 01:00:00";5
"2013-02-28 02:00:00";5

I am sure I am missing some simple way to achieve this. Any suggestions?

Answer 1

You can combine a self-join with window functions.

To keep things simple, I use this table with these sample values:

create table t ( a int, b int);    
insert into t values 
( 1, 1),
( 2, Null),
( 3, Null),
( 4, 2 ),
( 5, Null),
( 6, Null);

In your query, a is trunc_u and b is your id. The query is:

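with cte as (
    select 
      t1.a, 
      -- use the row's own value, else the earlier value, else 0
      coalesce( t1.b, t2.b, 0) as b,
      -- rank the earlier non-null rows, nearest first
      rank() OVER 
       (PARTITION BY t1.a ORDER BY t2.a DESC) as pos
    from t t1 
    -- pair each row with every earlier row that has a value
    left outer join t t2
      on t2.b is not null and
         t2.a < t1.a
)
-- keep only the nearest earlier row for each a
select a, b
from cte
where pos = 1;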

Results:

| A | B |
---------
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
| 6 | 2 |
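
For comparison, here is a minimal sketch of the same forward fill using window functions only, with no self-join. This is a different technique from the query above, not a rewrite of it. The names sample and grp are made up for illustration, and the sample values are inlined so the statement runs on its own in PostgreSQL.

```sql
-- Same sample values as table t above, inlined so the query is self-contained.
WITH sample(a, b) AS (
    VALUES (1, 1), (2, NULL), (3, NULL), (4, 2), (5, NULL), (6, NULL)
),
grouped AS (
    SELECT a, b,
           -- count(b) skips NULLs, so grp only increases on rows that have a value
           count(b) OVER (ORDER BY a) AS grp
    FROM sample
)
SELECT a,
       -- the first row of each group holds the value to carry forward;
       -- 0 is the fallback for leading rows that have no earlier value
       COALESCE(first_value(b) OVER (PARTITION BY grp ORDER BY a), 0) AS b
FROM grouped
ORDER BY a;
```

Each non-NULL row starts a new group, so every NULL row lands in the group of the nearest earlier value. On the sample data this should return the same six rows as the table above.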

Answer 2

Try:

WITH t as (
  SELECT time_series as trunc 
    FROM generate_series('2013-02-27 22:00'::timestamp, '2013-02-28 2:00', 
                         '1 hour') as time_series
)
SELECT DISTINCT ON(t.trunc) t.trunc, e.id
  FROM t
  JOIN event e
    ON e.created < t.trunc 
 ORDER BY t.trunc, e.created DESC
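
Note that the join condition above only matches events created before the start of each hour, and the inner join drops hours that have no earlier event at all. If the goal is the output shown in the question, one possible variant is sketched below. It assumes that an event counts toward the hour in which it was created and that 0 is just a default for hours with no event so far. The event_demo rows are hypothetical sample data, not taken from the question.

```sql
-- Hypothetical sample events, inlined so the query is self-contained.
WITH event_demo(id, created) AS (
    VALUES (1, timestamp '2013-02-27 23:10'),
           (2, timestamp '2013-02-27 23:40'),
           (5, timestamp '2013-02-28 00:20')
)
SELECT DISTINCT ON (t.trunc)
       t.trunc,
       COALESCE(e.id, 0) AS id        -- assumed default when no event exists yet
FROM generate_series(timestamp '2013-02-27 22:00',
                     timestamp '2013-02-28 02:00',
                     interval '1 hour') AS t(trunc)
LEFT JOIN event_demo e                -- LEFT JOIN keeps hours without any match
       ON e.created < t.trunc + interval '1 hour'   -- include events inside the hour itself
ORDER BY t.trunc, e.created DESC;     -- DISTINCT ON keeps the latest event per hour
```

With this sample data the ids come out as 0, 2, 5, 5, 5 for the five hours, which matches the shape of the desired result in the question.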

If it is too slow, let me know and I will give you a faster query.
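
The faster query mentioned here is not shown in the thread. As an illustration only, one common pattern is a correlated subquery with LIMIT 1, which looks up a single event per hour instead of joining every hour to every earlier event. Whether it is actually faster depends on the data volume and on an index on the created column, which is assumed here. The event_demo rows are hypothetical sample data.

```sql
-- Hypothetical sample events, inlined so the query is self-contained.
WITH event_demo(id, created) AS (
    VALUES (1, timestamp '2013-02-27 23:10'),
           (2, timestamp '2013-02-27 23:40'),
           (5, timestamp '2013-02-28 00:20')
)
SELECT t.trunc,
       COALESCE(
           (SELECT e.id
            FROM event_demo e
            WHERE e.created < t.trunc + interval '1 hour'  -- up to the end of this hour
            ORDER BY e.created DESC                        -- latest event first
            LIMIT 1),
           0) AS id                                        -- assumed default when nothing matches
FROM generate_series(timestamp '2013-02-27 22:00',
                     timestamp '2013-02-28 02:00',
                     interval '1 hour') AS t(trunc)
ORDER BY t.trunc;
```

Against the real table, event_demo would be replaced by event, and the per-hour lookup could then use an index on created if one exists.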

PostgreSQL: use the value from the previous row if missing
