Many-to-many left join where rows are matched on the minimum date difference

Time: 2018-11-20 05:13:20

Tags: mysql sql

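I am trying to join two tables, one containing transaction data and the other containing travel orders. My goal is to associate each record in the transactions table with a single record in the travel orders table: the one where the difference between the transaction's purchase date and the travel order's start date is smallest. The tables are joined on SSN, and I want a left join because I need to keep every record in the transaction data, even when there is no travel order for that SSN.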

Here is some sample data:

    /* Create and populate transactions */
    CREATE TABLE transactions (
        ssn INT,
        purchase_date DATE,
        price FLOAT
    );

    INSERT INTO transactions VALUES 
        (1111, "2018-12-31", 12.20),
        (1111, "2018-11-01", 22.23),
        (2222, "2018-08-17", 99.23),
        (4444, "2018-06-07", 13.22),
        (5555, "2018-03-05", 22.22),
        (6666, "2018-05-29", 11.11),
        (7777, "2018-10-10", 23.32),
        (8888, "2018-06-21", 44.44),
        (8888, "2018-01-19", 55.55),
        (8888, "2018-02-25", 66.53);

    /* Create and populate travel orders */
    CREATE TABLE travel_orders (
        ssn_id INT NOT NULL,
        start_date DATE
        );

    INSERT INTO travel_orders VALUES 
        (1111, "2018-12-28"),
        (1111, "2018-12-07"),
        (2222, "2018-08-12"),
        (7777, "2018-10-10"),
        (7777, "2018-10-14"),
        (8888, "2018-06-18"),
        (8888, "2018-01-19"),
        (8888, "2018-02-22");

For example, the transactions record

(8888, "2018-06-21", 44.44)

would be joined to the travel order record

(8888, "2018-06-18")

and so on for the remaining records.

Edit: the expected output would look something like this:

+------+---------------+-------+--------+------------+
| ssn  | purchase_date | price | ssn_id | start_date |
+------+---------------+-------+--------+------------+
| 1111 | 2018-12-31    |  12.2 |   1111 | 2018-12-28 |
| 1111 | 2018-11-01    | 22.23 |   1111 | 2018-12-07 |
| 2222 | 2018-08-17    | 99.23 |   2222 | 2018-08-12 |
| 4444 | 2018-06-07    | 13.22 |   4444 | NULL       |
| 5555 | 2018-03-05    | 22.22 |   5555 | NULL       |
| 6666 | 2018-05-29    | 11.11 |   6666 | NULL       |
| 7777 | 2018-10-10    | 23.32 |   7777 | 2018-10-10 |
| 8888 | 2018-06-21    | 44.44 |   8888 | 2018-06-18 |
| 8888 | 2018-01-19    | 55.55 |   8888 | 2018-01-19 |
| 8888 | 2018-02-25    | 66.53 |   8888 | 2018-02-22 |
+------+---------------+-------+--------+------------+

I have some basic starter code:

SELECT t.*,
       o.*
FROM transactions AS t
LEFT JOIN travel_orders AS o
ON t.ssn = o.ssn_id;

But I need to add a filter so that records are matched on the minimum date difference between the purchase date and the start date.
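To see what such a filter has to choose between, it can help to list the gap for every candidate pair first. The following is only an exploratory sketch, assuming MySQL's `DATEDIFF` (which returns the first date minus the second, in days); the alias `days_apart` is introduced here for illustration and does not exist in the tables above.

```sql
-- Exploratory sketch: one row per (transaction, travel order) pair that
-- shares an SSN, with the absolute gap in days between the two dates.
-- Transactions with no travel order are kept, with a NULL gap.
SELECT t.ssn,
       t.purchase_date,
       o.start_date,
       ABS(DATEDIFF(o.start_date, t.purchase_date)) AS days_apart  -- hypothetical alias
FROM transactions AS t
LEFT JOIN travel_orders AS o
    ON t.ssn = o.ssn_id
ORDER BY t.ssn, t.purchase_date, days_apart;
```

For the transaction (8888, 2018-06-21), the row paired with 2018-06-18 has a gap of 3 days, the smallest of its three candidates, which is why that pair is the desired match.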

1 Answer:

Answer 0 (score: 2)

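This query will give you the results you want. The subquery finds the minimum number of days between the purchase date and the start date for each combination of ssn and purchase date. That result is then joined back to the transactions and travel_orders tables to get the desired output: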

SELECT t.*,
       o.*
FROM transactions t
JOIN (SELECT t.ssn, 
             t.purchase_date,
             MIN(ABS(DATEDIFF(o.start_date, t.purchase_date))) AS dd
             FROM transactions t
             LEFT JOIN travel_orders o
             ON t.ssn = o.ssn_id
             GROUP BY t.ssn, t.purchase_date) d
    ON d.ssn = t.ssn AND d.purchase_date = t.purchase_date
LEFT JOIN travel_orders o
    ON o.ssn_id = t.ssn AND ABS(DATEDIFF(o.start_date, t.purchase_date)) = d.dd
ORDER BY t.ssn, t.price

Output:

ssn     purchase_date   price   ssn_id  start_date
1111    2018-12-31      12.2    1111    2018-12-28
1111    2018-11-01      22.23   1111    2018-12-07
2222    2018-08-17      99.23   2222    2018-08-12
4444    2018-06-07      13.22   (null)  (null)
5555    2018-03-05      22.22   (null)  (null)
6666    2018-05-29      11.11   (null)  (null)
7777    2018-10-10      23.32   7777    2018-10-10
8888    2018-06-21      44.44   8888    2018-06-18
8888    2018-01-19      55.55   8888    2018-01-19
8888    2018-02-25      66.53   8888    2018-02-22

demo on SQLFiddle
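One caveat with matching on the minimum gap: if two travel orders are equally close to a purchase date, the final LEFT JOIN matches both of them, so that transaction appears twice in the result. A possible alternative, sketched here on the assumption that MySQL 8.0 or later is available (window functions), ranks the candidate orders for each transaction and keeps only the first. The names `ranked` and `rn` are illustrative, and breaking ties in favour of the earlier `start_date` is an arbitrary choice that the question does not specify. Because `transactions` has no primary key, the sketch partitions by all three of its columns, which assumes no two transactions are completely identical.

```sql
-- Sketch (assumes MySQL 8.0+): keep exactly one travel order per transaction.
SELECT ssn, purchase_date, price, ssn_id, start_date
FROM (
    SELECT t.ssn,
           t.purchase_date,
           t.price,
           o.ssn_id,
           o.start_date,
           -- Rank the candidate orders of each transaction by closeness;
           -- ties go to the earlier start_date (an assumed tie-break rule).
           ROW_NUMBER() OVER (
               PARTITION BY t.ssn, t.purchase_date, t.price
               ORDER BY ABS(DATEDIFF(o.start_date, t.purchase_date)),
                        o.start_date
           ) AS rn
    FROM transactions AS t
    LEFT JOIN travel_orders AS o
        ON o.ssn_id = t.ssn
) AS ranked
WHERE rn = 1          -- unmatched transactions also get rn = 1, with NULL order columns
ORDER BY ssn, price;
```

The sample data contains no ties, so this should return the same ten rows as the query above; the two approaches only differ when several orders share the minimum gap.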