Removing duplicate rows from a relation

Time: 2015-09-10 23:05:21

Tags: postgresql duplicates rows

I have the following code, which produces a relation:

SELECT book_id, shipments.customer_id
FROM shipments 
LEFT JOIN editions ON (shipments.isbn = editions.isbn)
LEFT JOIN customers ON (shipments.customer_id = customers.customer_id)

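This relation contains customer_ids together with the book_ids of the books they bought. My goal is to build a relation that lists each book along with the number of unique customers who bought it. I assume one way to do this is to eliminate all duplicate rows from the relation and then count the instances of each book_id. So my question is: how can I remove all duplicate rows from this relation?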

Thanks!

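Edit: What I mean is that I want every row in the relation to be unique. For example, if there are three identical rows, two of them should be removed.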

1 Answer:

Answer 0: (score: 0)

The duplicates are in the shipments table. You can remove them with a DISTINCT clause and then count the buyers in an outer query with GROUP BY isbn:

SELECT isbn, count(customer_id) AS unique_buyers
FROM (
  SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
GROUP BY isbn;
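
To see the effect on a small input, here is a minimal sketch that runs the same query against made-up rows. The CTE named shipments and its values are hypothetical stand-ins for the real table, which has more columns:

```sql
-- Hypothetical sample data: customer 1 bought edition '0001' twice.
WITH shipments (isbn, customer_id) AS (
  VALUES
    ('0001', 1),
    ('0001', 1),  -- duplicate (isbn, customer_id) pair
    ('0001', 2),
    ('0002', 1)
)
SELECT isbn, count(customer_id) AS unique_buyers
FROM (
  -- DISTINCT collapses identical (isbn, customer_id) pairs into one row
  SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
GROUP BY isbn
ORDER BY isbn;
```

With this data the query returns 2 for isbn '0001' and 1 for isbn '0002', because the repeated purchase by customer 1 is counted only once.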

If you want a list of all books, including those that were never bought, LEFT JOIN the query above to the list of all books:

SELECT isbn, coalesce(unique_buyers, 0) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
  SELECT isbn, count(customer_id) AS unique_buyers
  FROM (
    SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
  GROUP BY isbn) books_bought USING (isbn)
ORDER BY isbn;
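
The role of coalesce may be easier to see with an edition that has no shipments. The following sketch assumes made-up data in two CTEs that stand in for the real editions and shipments tables, and it adds a raw_count column purely for illustration:

```sql
-- Hypothetical sample data: edition '0003' was never shipped.
WITH editions (isbn) AS (
  VALUES ('0001'), ('0002'), ('0003')
),
shipments (isbn, customer_id) AS (
  VALUES ('0001', 1), ('0001', 1), ('0001', 2), ('0002', 1)
)
SELECT isbn,
       unique_buyers AS raw_count,  -- NULL when the LEFT JOIN finds no match
       coalesce(unique_buyers, 0) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
  SELECT isbn, count(customer_id) AS unique_buyers
  FROM (
    SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
  GROUP BY isbn) books_bought USING (isbn)
ORDER BY isbn;
```

For '0003' the raw_count column is NULL while the coalesced column is 0; the other two editions show 2 and 1 in both columns.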

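You can write this more concisely by joining before counting. count(customer_id) skips the NULLs that the LEFT JOIN produces for editions nobody bought, so coalesce is no longer needed: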

SELECT isbn, count(customer_id) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
  SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer USING (isbn)
GROUP BY isbn
ORDER BY isbn;
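
The queries above count buyers per isbn, whereas the question asks for a count per book_id. As a possible variation that is not part of the original answer, count(DISTINCT ...) can do the de-duplication and the counting in one step. This sketch assumes that editions has isbn and book_id columns, as the query in the question suggests, and uses made-up data:

```sql
-- Hypothetical sample data: editions '0001' and '0002' belong to book 10.
WITH editions (isbn, book_id) AS (
  VALUES ('0001', 10), ('0002', 10), ('0003', 20)
),
shipments (isbn, customer_id) AS (
  VALUES ('0001', 1), ('0001', 1), ('0002', 1), ('0002', 2)
)
SELECT e.book_id,
       -- counts each customer once per book and ignores NULLs
       count(DISTINCT s.customer_id) AS unique_buyers
FROM editions e
LEFT JOIN shipments s USING (isbn)
GROUP BY e.book_id
ORDER BY e.book_id;
```

Here book 10 gets 2 unique buyers, even though customer 1 bought two different editions of it, and book 20 gets 0. Whether counting per book or per edition is the right choice depends on what the report is meant to show.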