How do I fill NULL values in PSQL with the last populated value for the same ID?

Time: 2018-10-31 13:06:38

Tags: postgresql

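I have a table in PostgreSQL like the one shown below. I want the latest record for each id, and if that latest record contains a NULL in any column, I want to replace it with the next most recent non-NULL value from the same column.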

Data

id    ingdt         code  gender  address   
1     27-10-2018    NULL  NULL    street1    
1     24-10-2018    1234  NULL    street2
1     20-08-2017    3245  M       street2
2     24-09-2018    NULL  F       Astreet
2     24-10-2018    2857  F       Bstreet
3     24-08-2018    3489  M       NULL
3     22-08-2018    5802  M       Cstreet

Expected output

final_output

id    ingdt         code  gender  address   
1     27-10-2018    1234  M       street1   
2     24-10-2018    2857  F       Bstreet
3     24-08-2018    3489  M       Cstreet

Attempt

insert into final_output select * from (
(select code, id from data where code != null order by ingdt limit 1) x join
(select gender, id from data where gender != null order by ingdt limit 1) y join 
(select address, id from data where address != null order by ingdt limit 1)z on y.id=x.id)

1 answer:

Answer 0 (score: 1)

demo:db<>fiddle

Window functions can help you here:

SELECT DISTINCT
    id,
    -- latest date per id
    max(ingdt) OVER (PARTITION BY id) AS ingdt,
    -- per column: non-NULL values first, then the most recent one
    first_value(code) OVER (PARTITION BY id ORDER BY code IS NULL, ingdt DESC) AS code,
    first_value(gender) OVER (PARTITION BY id ORDER BY gender IS NULL, ingdt DESC) AS gender,
    first_value(address) OVER (PARTITION BY id ORDER BY address IS NULL, ingdt DESC) AS address
FROM mytable
ORDER BY id
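
To try the query locally, the question's sample data can be loaded into a table first. This is a minimal sketch, not the asker's actual schema: it assumes the table is named mytable as in the query above, that ingdt is a date column, that code is an integer, and that gender and address are text. The question does not state the column types, and its dd-mm-yyyy dates are rewritten here as ISO date literals.

```sql
-- Assumed schema; the column types are guesses based on the sample data.
CREATE TABLE mytable (
    id      integer,
    ingdt   date,
    code    integer,
    gender  text,
    address text
);

-- Sample rows from the question (dates converted from dd-mm-yyyy).
INSERT INTO mytable (id, ingdt, code, gender, address) VALUES
    (1, DATE '2018-10-27', NULL, NULL, 'street1'),
    (1, DATE '2018-10-24', 1234, NULL, 'street2'),
    (1, DATE '2017-08-20', 3245, 'M',  'street2'),
    (2, DATE '2018-09-24', NULL, 'F',  'Astreet'),
    (2, DATE '2018-10-24', 2857, 'F',  'Bstreet'),
    (3, DATE '2018-08-24', 3489, 'M',  NULL),
    (3, DATE '2018-08-22', 5802, 'M',  'Cstreet');
```

If ingdt were stored as text in dd-mm-yyyy form instead, max(ingdt) and ORDER BY ingdt DESC would compare strings rather than dates, so the column would need converting (for example with to_date) before the query gives the intended result.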

Explanation of first_value(...) OVER (...)

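A window function lets you split your rows into separate groups (partitions). This is done with the keyword PARTITION BY. In this case, I create one partition per id.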
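
The effect of PARTITION BY can be seen in isolation with a small sketch. The CTE name sample and the output column names are made up for illustration; only the id and ingdt values come from the question.

```sql
-- Inline sample data, so the statement runs without any existing table.
WITH sample (id, ingdt) AS (
    VALUES
        (1, DATE '2018-10-27'),
        (1, DATE '2018-10-24'),
        (2, DATE '2018-10-24'),
        (2, DATE '2018-09-24')
)
SELECT
    id,
    ingdt,
    -- each aggregate is computed only over the rows sharing the same id
    count(*)   OVER (PARTITION BY id) AS rows_for_id,
    max(ingdt) OVER (PARTITION BY id) AS latest_ingdt
FROM sample
ORDER BY id, ingdt DESC;
```

Unlike GROUP BY, the rows are not collapsed: every input row is returned, each carrying the values computed for its own id. That is why the answer's query needs DISTINCT to end up with one row per id.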

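Next, I check whether the column's value is NULL. This yields true or false. Sorting on this result works like sorting any boolean column: false (meaning NOT NULL) comes first. If there are several NOT NULL rows, the most recent one is taken (ingdt DESC). This ordering is also applied separately within each partition.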
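
The ordering can be made visible by selecting the boolean next to the value. This is a sketch using only the code column of the question's id 1 rows; the names sample, code_is_null and picked_code are illustrative.

```sql
-- The three rows for id 1 from the question, code column only.
WITH sample (id, ingdt, code) AS (
    VALUES
        (1, DATE '2018-10-27', NULL::integer),
        (1, DATE '2018-10-24', 1234),
        (1, DATE '2017-08-20', 3245)
)
SELECT
    ingdt,
    code,
    code IS NULL AS code_is_null,
    -- first row after sorting: non-NULL codes first, newest first
    first_value(code) OVER (ORDER BY code IS NULL, ingdt DESC) AS picked_code
FROM sample
ORDER BY code IS NULL, ingdt DESC;
```

The row dated 2018-10-27 is the newest, but its code is NULL, so it sorts last. The row dated 2018-10-24 comes first, and picked_code is 1234 on every row.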

first_value() returns the first value of the ordered partition.
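
Since the question's attempt writes into final_output, the same SELECT can feed an INSERT. This is a sketch under assumptions: mytable exists with the columns shown in the question, and final_output either already exists with the same columns or is created here as a structural copy.

```sql
-- Assumption: mytable has the columns (id, ingdt, code, gender, address).
-- Create final_output with the same column definitions if it is missing.
CREATE TABLE IF NOT EXISTS final_output (LIKE mytable);

INSERT INTO final_output (id, ingdt, code, gender, address)
SELECT DISTINCT
    id,
    max(ingdt) OVER (PARTITION BY id) AS ingdt,
    first_value(code) OVER (PARTITION BY id ORDER BY code IS NULL, ingdt DESC) AS code,
    first_value(gender) OVER (PARTITION BY id ORDER BY gender IS NULL, ingdt DESC) AS gender,
    first_value(address) OVER (PARTITION BY id ORDER BY address IS NULL, ingdt DESC) AS address
FROM mytable;
```

Running the statement twice inserts the rows twice, because nothing here enforces uniqueness on id.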