How do I join two tables that have multiple matching records?

Asked: 2018-04-17 15:31:12

Tags: sql postgresql

I want to join two tables where the id and the date line up. What I have in mind is:

SELECT * from t1 inner join t2 on t1.id = t2.id and t1.date >= t2.date;

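But there can be multiple entries in both tables, and I want to make sure each record gets its best match. So if table 2 contains entries for 4/10, 4/15 and 4/20, and table 1 has records for 3/15, 4/11, 4/16 and 4/21, then the records should match as follows: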

id | t1.date | t2.date
1  |   3/15  | -----  (No match, wouldn't be returned in the results)
1  |   4/11  | 4/10
1  |   4/16  | 4/15
1  |   4/21  | 4/20

1 Answer:

Answer 0: (score: 1)

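One way is to LEFT JOIN the tables on id and t1.date >= t2.date, apply the window function row_number() over partitions ordered by (t1.date - t2.date), and keep the row with the smallest date difference, which is the first row of each partition: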

CREATE TABLE t1 ("id" integer, "date" date, "value" integer);

CREATE TABLE t2 ("id" integer, "date" date);

INSERT INTO t1 VALUES
  (1, '2018-03-15', 10),
  (1, '2018-04-11', 20),
  (1, '2018-04-11', 30),
  (1, '2018-04-16', 30),
  (1, '2018-04-21', 20);

INSERT INTO t2 VALUES
  (1, '2018-04-10'),
  (1, '2018-04-15'),
  (1, '2018-04-20');

WITH q AS (
  SELECT
    t1."id", t1."date" AS t1_date, t2."date" AS t2_date, t1."value",
    -- Number the candidate t2 rows for each t1 row, closest earlier date first
    row_number() OVER
      (PARTITION BY t1."id", t1."date", t1."value" ORDER BY (t1."date" - t2."date")) AS row_num
  FROM
    t1 LEFT JOIN t2 ON t1."id" = t2."id" AND t1."date" >= t2."date"
)
-- Keep only the closest match per t1 row
SELECT "id", "t1_date", "t2_date", "value" FROM q
WHERE row_num = 1
ORDER BY "t1_date", "value";

--  id |  t1_date   |  t2_date   | value 
-- ----+------------+------------+-------
--   1 | 2018-03-15 |            |    10
--   1 | 2018-04-11 | 2018-04-10 |    20
--   1 | 2018-04-11 | 2018-04-10 |    30
--   1 | 2018-04-16 | 2018-04-15 |    30
--   1 | 2018-04-21 | 2018-04-20 |    20
-- (5 rows)
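
As an alternative that is not part of the original answer: in PostgreSQL 9.3 or later, a LATERAL subquery can pick the latest t2 date on or before each t1 row directly. This is only a sketch, assuming the same t1 and t2 tables as above; the alias m is an illustrative name.

```sql
-- For each t1 row, look up the single closest t2 date that is not later than it.
SELECT t1."id", t1."date" AS t1_date, m."date" AS t2_date, t1."value"
FROM t1
LEFT JOIN LATERAL (
  SELECT t2."date"
  FROM t2
  WHERE t2."id" = t1."id" AND t2."date" <= t1."date"
  ORDER BY t2."date" DESC   -- latest qualifying date first
  LIMIT 1                   -- keep only the closest match
) m ON true                 -- LEFT JOIN keeps t1 rows with no match (t2_date is NULL)
ORDER BY t1."date", t1."value";
```

On the sample data this should return the same five rows as the window-function query. One difference in design: because nothing is partitioned on the t1 columns, two fully identical t1 rows would both be kept here, whereas the row_number() version would collapse them into one.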

Note that if you want the result set to exclude rows with no matching t2.date, simply replace LEFT JOIN with INNER JOIN in the CTE.
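
As a sketch of that variant, reusing the t1 and t2 tables defined above, the query with INNER JOIN would look roughly like this. The 2018-03-15 row has no t2 date on or before it, so it drops out and four rows remain.

```sql
WITH q AS (
  SELECT
    t1."id", t1."date" AS t1_date, t2."date" AS t2_date, t1."value",
    row_number() OVER (
      PARTITION BY t1."id", t1."date", t1."value"
      ORDER BY (t1."date" - t2."date")
    ) AS row_num
  FROM
    -- INNER JOIN discards t1 rows that have no earlier-or-equal t2 date
    t1 INNER JOIN t2 ON t1."id" = t2."id" AND t1."date" >= t2."date"
)
SELECT "id", "t1_date", "t2_date", "value" FROM q
WHERE row_num = 1
ORDER BY "t1_date", "value";
```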