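Get a merged row from multiple tables

Time: 2019-07-29 01:43:33

Tags: sql postgresql greatest-n-per-group

I have one main table and several sub-tables, each of which shares at least one column with the main table. The sub-tables hold updates to parts of the main table. I want to get the updated row of the main table as of a specific date.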

Main table:

table1
| id | colA | colB | colC | colD | colE | createDate  |
|:---|:----:|:----:|:----:|:----:|:----:|:-----------:|
| a1 |  1   |  1   |  1   |  1   |  1   |  2017/01/01 |

Sub-tables:

table2
| mainid | colA | colB | createDate  |
|:------:|:----:|:----:|:-----------:|
|   a1   |  2   |  2   |  2018/05/01 |
|   a1   |  3   |  3   |  2019/01/01 |
|   a1   |  4   |  4   |  2020/01/01 |

table3
| mainid | colA | colB | colC | createDate  |
|:------:|:----:|:----:|:----:|:-----------:|
|   a1   |  6   |  6   |  6   |  2019/01/01 |
|   a1   |  7   |  7   |  7   |  2020/01/01 |
|   a1   |  8   |  8   |  8   |  2021/01/01 |

table4
| mainid | colA | colE | colC | createDate  |
|:------:|:----:|:----:|:----:|:-----------:|
|   a1   |  9   |  9   |  9   |  2018/06/01 |
|   a1   |  10  |  10  |  10  |  2017/01/01 |
|   a1   |  12  |  12  |  12  |  2020/01/01 |
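
For anyone who wants to reproduce this, the sample data above could be created roughly as follows. The column types (text keys, integer values, `date` for `createDate`) and the NOT NULL constraints are assumptions for illustration; the real tables may differ.

```sql
-- Assumed schema: types and constraints are guesses, only names come from the tables above.
CREATE TABLE table1 (
   id text PRIMARY KEY
 , colA int NOT NULL, colB int NOT NULL, colC int NOT NULL
 , colD int NOT NULL, colE int NOT NULL
 , createDate date NOT NULL
);
CREATE TABLE table2 (mainid text NOT NULL, colA int NOT NULL, colB int NOT NULL, createDate date NOT NULL);
CREATE TABLE table3 (mainid text NOT NULL, colA int NOT NULL, colB int NOT NULL, colC int NOT NULL, createDate date NOT NULL);
CREATE TABLE table4 (mainid text NOT NULL, colA int NOT NULL, colE int NOT NULL, colC int NOT NULL, createDate date NOT NULL);

INSERT INTO table1 VALUES ('a1', 1, 1, 1, 1, 1, '2017-01-01');

INSERT INTO table2 VALUES
   ('a1', 2, 2, '2018-05-01')
 , ('a1', 3, 3, '2019-01-01')
 , ('a1', 4, 4, '2020-01-01');

INSERT INTO table3 VALUES
   ('a1', 6, 6, 6, '2019-01-01')
 , ('a1', 7, 7, 7, '2020-01-01')
 , ('a1', 8, 8, 8, '2021-01-01');

-- Column order in table4 is (mainid, colA, colE, colC, createDate).
INSERT INTO table4 VALUES
   ('a1',  9,  9,  9, '2018-06-01')
 , ('a1', 10, 10, 10, '2017-01-01')
 , ('a1', 12, 12, 12, '2020-01-01');
```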

I get the relevant row from each table with the following queries:

select * from table2 where createDate < '2018-07-01' and mainid='a1' order by createDate desc limit 1;
select * from table3 where createDate < '2018-07-01' and mainid='a1' order by createDate desc limit 1;
select * from table4 where createDate < '2018-07-01' and mainid='a1' order by createDate desc limit 1;

select * from table1 where id = 'a1'; 

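Now I want to merge these rows with the row from the main table. If several tables provide a value for the same column, the value from the most recent row should be used, like this: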

table1 -> colD: 1
table2 -> colB: 2
table3 -> nothing
table4 -> colA: 9, colC: 9, colE: 9

Selected row:
| id | colA | colB | colC | colD | colE |filteredDate |
|:---|:----:|:----:|:----:|:----:|:----:|:-----------:|
| a1 |  9   |  2   |  9   |  1   |  9   |  2018/07/01 |

How can I do this in a single query? Is it possible at all? Should I try a different approach?

1 answer:

Answer 0: (score: 0)

This assumes all columns are NOT NULL. Otherwise you will need to do more.

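First, you can UNION ALL the queries you listed, filling in NULL for the missing columns to get compatible row types. Then aggregate. The remaining difficulty is that stock Postgres has no aggregate function that fits this task perfectly ...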

With standard SQL tools

SELECT id
    , (array_agg(colA ORDER BY colA IS NULL, createDate DESC))[1] AS colA
    , (array_agg(colB ORDER BY colB IS NULL, createDate DESC))[1] AS colB
    , (array_agg(colC ORDER BY colC IS NULL, createDate DESC))[1] AS colC
    , (array_agg(colD ORDER BY colD IS NULL, createDate DESC))[1] AS colD
    , (array_agg(colE ORDER BY colE IS NULL, createDate DESC))[1] AS colE
FROM (
   select      id, colA, colB, colC, colD, colE, createDate from table1 where id = 'a1'
   UNION ALL
   (select mainid, colA, colB, NULL, NULL, NULL, createDate from table2 where createDate < '2018-07-01' and mainid='a1' order by createDate desc limit 1)
   UNION ALL
   (select mainid, colA, colB, colC, NULL, NULL, createDate from table3 where createDate < '2018-07-01' and mainid='a1' order by createDate desc limit 1)
   UNION ALL
   (select mainid, colA, NULL, colC, NULL, colE, createDate from table4 where createDate < '2018-07-01' and mainid='a1' order by createDate desc limit 1)
   ) sub
GROUP BY 1;
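
To see what the `array_agg(... ORDER BY col IS NULL, createDate DESC))[1]` expression does in isolation, here is a minimal sketch on made-up values (the names `val` and `created` are hypothetical, not from the question). Since `false` sorts before `true`, non-null values come first, and among those the most recent one wins:

```sql
-- Picks the latest non-null value; the NULL row is newest but sorts last.
SELECT (array_agg(val ORDER BY val IS NULL, created DESC))[1] AS latest_non_null
FROM  (VALUES
         (1        , date '2017-01-01')
       , (2        , date '2018-05-01')
       , (NULL::int, date '2018-06-01')
      ) AS t(val, created);
-- returns 2
```

This is also why the approach only works if the real columns are NOT NULL: a genuine NULL in the latest row cannot be told apart from the NULL padding added for the UNION ALL.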

With a custom aggregate function first()

Simpler and faster with a custom aggregate function, as described in the Postgres Wiki:

CREATE OR REPLACE FUNCTION first_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE SQL IMMUTABLE STRICT AS 'SELECT $1';

CREATE AGGREGATE FIRST (
        sfunc    = first_agg,
        basetype = anyelement,
        stype    = anyelement
);
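
A quick check of the aggregate, assuming it has been created as above, might look like this (the names `val` and `ord` are hypothetical). Because the transition function is declared STRICT, NULL inputs are skipped, so the aggregate returns the first non-null value in input order:

```sql
-- The leading NULL is skipped; 7 becomes the state and later values are ignored.
SELECT first(val ORDER BY ord) AS first_non_null
FROM  (VALUES
         (NULL::int, 1)
       , (7        , 2)
       , (8        , 3)
      ) AS t(val, ord);
-- returns 7
```

The `ORDER BY` inside the aggregate call makes the input order explicit. Sorting in a subquery instead typically works as well, but then the order is not formally guaranteed.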

Then:

SELECT id
     , first(colA) AS colA
     , first(colB) AS colB
     , first(colC) AS colC
     , first(colD) AS colD
     , first(colE) AS colE
FROM (
   SELECT      id, colA, colB, colC, colD, colE, createDate FROM table1 WHERE     id='a1'
   UNION ALL
   (SELECT mainid, colA, colB, NULL, NULL, NULL, createDate FROM table2 WHERE mainid='a1' AND createDate < '2018-07-01' ORDER BY createDate DESC LIMIT 1)
   UNION ALL
   (SELECT mainid, colA, colB, colC, NULL, NULL, createDate FROM table3 WHERE mainid='a1' AND createDate < '2018-07-01' ORDER BY createDate DESC LIMIT 1)
   UNION ALL
   (SELECT mainid, colA, NULL, colC, NULL, colE, createDate FROM table4 WHERE mainid='a1' AND createDate < '2018-07-01' ORDER BY createDate DESC LIMIT 1)
   ORDER BY createDate DESC
   ) sub
GROUP  BY 1;

A C implementation provided by an additional module is faster still.
