Fill null values along a time series

Time: 2018-02-13 16:36:22

Tags: sql postgresql

I have a table like this:

ts                    item          infoA         infoB         
2018-02-03 12:00:00   A             null          null          
2018-02-03 12:01:00   null          A1            null          
2018-02-03 12:02:00   A             null          null          
2018-02-03 12:03:00   null          null          null          
2018-02-03 12:04:00   null          A2            null           
2018-02-03 12:05:00   null          null          null         
2018-02-03 12:06:00   B             null          null         
2018-02-03 12:07:00   null          null          B1         
2018-02-03 12:08:00   null          null          null         

I want to fill the null values forward along the time series, but only within the related item:
ts                    item          infoA         infoB         
2018-02-03 12:00:00   A             null          null          
2018-02-03 12:01:00   A             A1            null          
2018-02-03 12:02:00   A             A1            null          
2018-02-03 12:03:00   A             A1            null          
2018-02-03 12:04:00   A             A2            null           
2018-02-03 12:05:00   A             A2            null         
2018-02-03 12:06:00   B             null          null         
2018-02-03 12:07:00   B             null          B1         
2018-02-03 12:08:00   B             null          B1            

I found an aggregate function, GapFill(). Using that function, I can get the table with:
select t1.ts, t1.item, t2.infoA, t3.infoB
from
(select ts, gapfill(item) OVER (ORDER BY ts) as item from t) t1
LEFT JOIN (select ts, gapfill(infoA) OVER (ORDER BY ts) as infoA from t) t2 on (t1.ts = t2.ts and t1.item = 'A')
LEFT JOIN (select ts, gapfill(infoB) OVER (ORDER BY ts) as infoB from t) t3 on (t1.ts = t3.ts and t1.item = 'B')

If I have many columns, how can I simplify the query?

1 Answer:

Answer 0: (score: 1)

What you want is the ignore nulls option on lag(). But Postgres does not (yet) support it.

Perhaps the simplest method is a correlated subquery:

select t.ts,
       coalesce(item,
                (select t2.item
                 from t t2
                 where t2.ts < t.ts and t2.item is not null
                 order by t2.ts desc
                 fetch first 1 row only
                )
               ) as item,
       coalesce(infoA,
                (select t2.infoA
                 from t t2
                 where t2.ts < t.ts and t2.infoA is not null
                 order by t2.ts desc
                 fetch first 1 row only
                )
               ) as infoA,
       coalesce(infoB,
                (select t2.infoB
                 from t t2
                 where t2.ts < t.ts and t2.infoB is not null
                 order by t2.ts desc
                 fetch first 1 row only
                )
               ) as infoB
from t;

If you know the values are monotonically increasing or decreasing, you can use max() or min().
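For the monotonic case, a minimal sketch might look like the following. It assumes the table is named t and that the non-null values in each column never decrease as ts grows, which happens to hold for the sample data ('A' then 'B', 'A1' then 'A2'). Under that assumption the largest value seen so far is also the most recent one, so a running max() carries it forward; for non-increasing values, min() would play the same role.

```sql
-- Sketch only: valid when each column's non-null values are non-decreasing in ts.
-- With ORDER BY and no explicit frame, the window covers the first row through
-- the current row, and max() skips nulls.
select ts,
       max(item)  over (order by ts) as item,
       max(infoA) over (order by ts) as infoA,
       max(infoB) over (order by ts) as infoB
from t
order by ts;
```

Note that this fills each column independently of item, so infoA keeps its last value after item changes to B.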

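Another approach with window functions uses the same idea. Do a cumulative count of the non-null values to identify groups of rows that share the same value, then spread that value across the rows of each group.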
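A sketch of the cumulative-count idea, assuming the table is named t and using the column names from the question. count(col) over (order by ts) only increments on non-null rows, so a non-null value and the nulls that follow it get the same group number, and max() over that group copies the value onto every row. Partitioning the info columns by the filled item is an assumption about what "only within the related item" means; the CTE names and the *_grp columns are made up for illustration.

```sql
with item_groups as (
    -- Running count of non-null items: each item value starts a new group.
    select ts, item, infoA, infoB,
           count(item) over (order by ts) as item_grp
    from t
), filled_item as (
    -- Each group holds exactly one non-null item; spread it over the group.
    select ts,
           max(item) over (partition by item_grp) as item,
           infoA, infoB
    from item_groups
), info_groups as (
    -- Same trick per info column, restarted for every item.
    select ts, item, infoA, infoB,
           count(infoA) over (partition by item order by ts) as a_grp,
           count(infoB) over (partition by item order by ts) as b_grp
    from filled_item
)
select ts, item,
       max(infoA) over (partition by item, a_grp) as infoA,
       max(infoB) over (partition by item, b_grp) as infoB
from info_groups
order by ts;
```

Each extra column needs one count() line and one max() line, with no additional joins. If the same item can reappear after a different one, partition by item treats both stretches as one; partitioning by the item group number instead would keep them separate.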