Match each record only once in a joined table

Time: 2018-02-13 11:26:28

Tags: sql postgresql

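I have two tables. The first, inv, contains invoice records; the second, pay, contains payments. I want to match the payments in the pay table to the invoices in inv by inv_amount and inv_date. There may be several invoices with the same amount on the same day, and there may also be several payments with the same amount on the same day.

A payment should be matched with the first matching invoice, and each payment may be matched only once.

Here is my data: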

inv

 inv_id | inv_amount |  inv_date  | inv_number
--------+------------+------------+------------
      1 |         10 | 2018-01-01 |          1
      2 |         16 | 2018-01-01 |          1
      3 |         12 | 2018-02-02 |          2
      4 |         14 | 2018-02-03 |          3
      5 |         19 | 2018-02-04 |          3
      6 |         19 | 2018-02-04 |          5
      7 |          5 | 2018-02-04 |          6
      8 |         40 | 2018-02-04 |          7
      9 |         19 | 2018-02-04 |          8
     10 |         19 | 2018-02-05 |          9
     11 |         20 | 2018-02-05 |         10
     12 |         20 | 2018-02-07 |         11

pay

 pay_id | pay_amount |  pay_date
--------+------------+------------
      1 |         10 | 2018-01-01
      2 |         12 | 2018-02-02
      4 |         19 | 2018-02-04
      3 |         14 | 2018-02-03
      5 |          5 | 2018-02-04
      6 |         19 | 2018-02-04
      7 |         19 | 2018-02-05
      8 |         20 | 2018-02-07

My query:

 SELECT DISTINCT ON (inv.inv_id) inv.inv_id,
    inv.inv_amount,
    inv.inv_date,
    inv.inv_number,
    pay.pay_id
   FROM ("2016".pay
     RIGHT JOIN "2016".inv ON (((pay.pay_amount = inv.inv_amount) AND (pay.pay_date = inv.inv_date))))
  ORDER BY inv.inv_id

which results in:

 inv_id | inv_amount |  inv_date  | inv_number | pay_id
--------+------------+------------+------------+--------
      1 |         10 | 2018-01-01 |          1 |      1
      2 |         16 | 2018-01-01 |          1 |
      3 |         12 | 2018-02-02 |          2 |      2
      4 |         14 | 2018-02-03 |          3 |      3
      5 |         19 | 2018-02-04 |          3 |      4
      6 |         19 | 2018-02-04 |          5 |      4
      7 |          5 | 2018-02-04 |          6 |      5
      8 |         40 | 2018-02-04 |          7 |
      9 |         19 | 2018-02-04 |          8 |      6
     10 |         19 | 2018-02-05 |          9 |      7
     11 |         20 | 2018-02-05 |         10 |
     12 |         20 | 2018-02-07 |         11 |      8

The record with inv_id = 6 should not be matched with pay_id = 4, because that would mean payment 4 was applied twice.

Desired result:

inv_id | inv_amount |  inv_date  | inv_number | pay_id
--------+------------+------------+------------+--------
      1 |         10 | 2018-01-01 |          1 |      1
      2 |         16 | 2018-01-01 |          1 |
      3 |         12 | 2018-02-02 |          2 |      2
      4 |         14 | 2018-02-03 |          3 |      3
      5 |         19 | 2018-02-04 |          3 |      4
      6 |         19 | 2018-02-04 |          5 |        <- should be empty
      7 |          5 | 2018-02-04 |          6 |      5
      8 |         40 | 2018-02-04 |          7 |
      9 |         19 | 2018-02-04 |          8 |      6
     10 |         19 | 2018-02-05 |          9 |      7
     11 |         20 | 2018-02-05 |         10 |
     12 |         20 | 2018-02-07 |         11 |      8

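Disclaimer: yes, I asked this question yesterday with the original data, but someone pointed out that my SQL was hard to read. I have therefore tried to create a cleaner representation of the problem.

For convenience, here is an SQL Fiddle to test with: http://sqlfiddle.com/#!17/018d7/1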

1 answer:

Answer 0: (score: 1)

Having seen this example, I think I understand your problem:

WITH payments_cte AS (
    SELECT
        payment_id,
        payment_amount,
        payment_date,
        ROW_NUMBER() OVER (PARTITION BY payment_amount, payment_date ORDER BY payment_id) AS payment_row
    FROM payments
), invoices_cte AS (
    SELECT
        invoice_id,
        invoice_amount,
        invoice_date,
        invoice_number,
        ROW_NUMBER() OVER (PARTITION BY invoice_amount, invoice_date ORDER BY invoice_id) AS invoice_row
    FROM invoices
)
SELECT invoice_id, invoice_amount, invoice_date, invoice_number, payment_id
FROM invoices_cte
LEFT JOIN payments_cte
    ON payment_amount = invoice_amount
    AND payment_date = invoice_date
    AND payment_row = invoice_row
ORDER BY invoice_id, payment_id
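
The query above uses longer names (invoices, payments, invoice_id, ...) than the inv and pay tables shown in the question. The following is a minimal sketch of the same row-numbering idea adapted to the question's names. The temporary tables, the integer column types, the omission of the "2016" schema prefix, and the CTE names inv_ranked and pay_ranked are assumptions made for illustration only. The idea is that rows sharing the same amount and date are numbered 1, 2, 3, ... on each side, and the n-th invoice is joined only to the n-th payment of the same group, so no payment can be used twice.

```sql
-- Sample data copied from the question.
-- Assumptions: TEMP tables, integer amounts, no "2016" schema prefix.
CREATE TEMP TABLE inv (
    inv_id     integer PRIMARY KEY,
    inv_amount integer,
    inv_date   date,
    inv_number integer
);

CREATE TEMP TABLE pay (
    pay_id     integer PRIMARY KEY,
    pay_amount integer,
    pay_date   date
);

INSERT INTO inv (inv_id, inv_amount, inv_date, inv_number) VALUES
    ( 1, 10, '2018-01-01',  1),
    ( 2, 16, '2018-01-01',  1),
    ( 3, 12, '2018-02-02',  2),
    ( 4, 14, '2018-02-03',  3),
    ( 5, 19, '2018-02-04',  3),
    ( 6, 19, '2018-02-04',  5),
    ( 7,  5, '2018-02-04',  6),
    ( 8, 40, '2018-02-04',  7),
    ( 9, 19, '2018-02-04',  8),
    (10, 19, '2018-02-05',  9),
    (11, 20, '2018-02-05', 10),
    (12, 20, '2018-02-07', 11);

INSERT INTO pay (pay_id, pay_amount, pay_date) VALUES
    (1, 10, '2018-01-01'),
    (2, 12, '2018-02-02'),
    (4, 19, '2018-02-04'),
    (3, 14, '2018-02-03'),
    (5,  5, '2018-02-04'),
    (6, 19, '2018-02-04'),
    (7, 19, '2018-02-05'),
    (8, 20, '2018-02-07');

-- Number the rows inside each (amount, date) group on both sides,
-- then join the n-th invoice of a group to the n-th payment of that group.
WITH pay_ranked AS (
    SELECT pay_id, pay_amount, pay_date,
           ROW_NUMBER() OVER (PARTITION BY pay_amount, pay_date
                              ORDER BY pay_id) AS rn
    FROM pay
), inv_ranked AS (
    SELECT inv_id, inv_amount, inv_date, inv_number,
           ROW_NUMBER() OVER (PARTITION BY inv_amount, inv_date
                              ORDER BY inv_id) AS rn
    FROM inv
)
SELECT i.inv_id, i.inv_amount, i.inv_date, i.inv_number, p.pay_id
FROM inv_ranked AS i
LEFT JOIN pay_ranked AS p
    ON  p.pay_amount = i.inv_amount
    AND p.pay_date   = i.inv_date
    AND p.rn         = i.rn
ORDER BY i.inv_id;
```

With the sample data, the three invoices of amount 19 on 2018-02-04 (inv_id 5, 6, 9) are numbered 1, 2, 3 and the two matching payments (pay_id 4, 6) are numbered 1, 2. Invoice 5 is therefore paired with payment 4, invoice 6 with payment 6, and invoice 9 is left without a payment. This differs from the desired table in the question, which leaves invoice 6 empty and gives payment 6 to invoice 9, but each payment is still used exactly once and payments go to the earliest invoices by inv_id.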