How to remove duplicates in one column based on the values of another column in psql

Time: 2019-11-11 17:21:11

Tags: sql postgresql greatest-n-per-group

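I have a database that is meant to mimic a library management system. I want to write a query that produces a table showing the top three most-borrowed books for each publisher, along with their rank (so the book borrowed most often from publisher X would show rank 1). I have a query that shows the following information: the titles of the borrowed books, their publishers, and the number of times each book has been borrowed. As you can see, Bloomsbury (UK) appears 7 times (once for each Harry Potter book), but I want it to show only the 3 most-borrowed Harry Potter books. Any help is much appreciated.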

                  title                   |       publisher        | times
------------------------------------------+------------------------+------
 Harry Potter and the Philosopher's Stone | Bloomsbury (UK)        |    2
 Harry Potter and the Deathly Hallows     | Bloomsbury (UK)        |    2
 Harry Potter the Goblet of Fire          | Bloomsbury (UK)        |    3
 The Fellowship of the Ring               | George Allen & Unwin   |    1
 Calculus                                 | Paerson Addison Wesley |    1
 Go Set a Watchman                        | HarperCollins          |    1
 Harry Potter the Half-Blood Prince       | Bloomsbury (UK)        |    4
 Harry Potter and the Chamber of Secrets  | Bloomsbury (UK)        |    3
 Harry Potter and Prisoner of Azkaban     | Bloomsbury (UK)        |    2
 Nineteen Eighty-Four                     | Secker & Warburg       |    1
 Harry Potter the Order of the Phoenix    | Bloomsbury (UK)        |    4
 To Kill a Mockingbird                    | J.B.Lippincott & Co    |    1

The query below produces the output above.

SELECT title, publisher, COUNT(borrowed.resid) AS times
FROM borrowed 
  CROSS JOIN book 
  CROSS JOIN bookinfo 
WHERE borrowed.resid = book.resid 
  AND book.isbn = bookinfo.isbn 
  AND book.copynumber = borrowed.copynumber 
GROUP BY title, publisher;

2 answers:

Answer 0 (score: 0)

SELECT title, publisher, times
FROM (
    SELECT *, RANK() OVER (PARTITION BY publisher ORDER BY times DESC) AS ranking
    FROM (
        SELECT title, publisher, COUNT(resid) AS times 
        FROM borrowed 
        JOIN book USING (resid, copynumber)
        JOIN bookinfo USING (isbn)
        GROUP BY title, publisher
    ) AS counts
) AS ranks
WHERE ranking <= 3
ORDER BY publisher, times DESC

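counts is the part you wrote, adjusted to use USING, which merges the identically named columns from both sides of the join (making the query shorter).

ranks is the part that ranks the books within each publisher using the rank function (a window function).

Finally, we get the top 3 by keeping only the rows whose rank is 3 or lower.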
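
To try the three steps end to end, here is a minimal sketch that can be pasted into psql. The table layouts and the sample rows are assumptions, inferred only from the column names used in the question's query (the real schema is not shown). It uses CTEs in place of nested subqueries, which is a stylistic swap and not what the answer wrote, and it also returns the ranking column, since the question asks for the rank to be displayed.

```sql
-- Hypothetical minimal schema, inferred from the column names in the question.
CREATE TEMP TABLE bookinfo (isbn text PRIMARY KEY, title text, publisher text);
CREATE TEMP TABLE book (resid int, copynumber int, isbn text,
                        PRIMARY KEY (resid, copynumber));
CREATE TEMP TABLE borrowed (resid int, copynumber int);

-- Made-up sample data: two titles from one publisher.
INSERT INTO bookinfo VALUES
    ('isbn-a', 'Book A', 'Publisher X'),
    ('isbn-b', 'Book B', 'Publisher X');
INSERT INTO book VALUES (1, 1, 'isbn-a'), (1, 2, 'isbn-a'), (2, 1, 'isbn-b');
INSERT INTO borrowed VALUES (1, 1), (1, 2), (1, 1), (2, 1);

WITH counts AS (
    -- Step 1: number of loans per title.
    SELECT title, publisher, COUNT(*) AS times
    FROM borrowed
    JOIN book USING (resid, copynumber)
    JOIN bookinfo USING (isbn)
    GROUP BY title, publisher
), ranks AS (
    -- Step 2: rank the titles within each publisher, most borrowed first.
    SELECT *, RANK() OVER (PARTITION BY publisher ORDER BY times DESC) AS ranking
    FROM counts
)
-- Step 3: keep the top 3 ranks per publisher.
SELECT title, publisher, times, ranking
FROM ranks
WHERE ranking <= 3
ORDER BY publisher, ranking;
```

With these made-up rows, Book A is borrowed three times and Book B once, so they get ranking 1 and 2 within Publisher X.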

Answer 1 (score: 0)

Fix the joins and add RANK:

select *
from 
 (
    SELECT title, publisher, COUNT(*) AS cnt,
       -- rank the counts
       rank() over (partition by publisher order by count(*) desc) as rnk 
    FROM borrowed 
      JOIN book 
        ON borrowed.resid = book.resid 
       AND book.copynumber = borrowed.copynumber 
      JOIN bookinfo 
        ON book.isbn = bookinfo.isbn 
    GROUP BY title, publisher
 ) as dt
where rnk <= 3

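You may want to switch to ROW_NUMBER (exactly 3 rows per publisher, or fewer if the publisher has fewer books) or DENSE_RANK (every row holding one of the 3 highest counts) instead of RANK (3 rows, possibly more if row #4 and later have the same count as row #3).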
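
The difference between the three functions is easiest to see on tied counts. The sketch below feeds the Bloomsbury (UK) rows from the question's output into a VALUES list, so no tables are needed, and computes all three side by side. The column aliases rnk, dense_rnk and row_num and the window name w are illustrative.

```sql
-- Counts copied from the question's output (Bloomsbury rows only).
WITH counts (title, publisher, times) AS (
    VALUES
        ('Harry Potter and the Philosopher''s Stone', 'Bloomsbury (UK)', 2),
        ('Harry Potter and the Deathly Hallows',      'Bloomsbury (UK)', 2),
        ('Harry Potter the Goblet of Fire',           'Bloomsbury (UK)', 3),
        ('Harry Potter the Half-Blood Prince',        'Bloomsbury (UK)', 4),
        ('Harry Potter and the Chamber of Secrets',   'Bloomsbury (UK)', 3),
        ('Harry Potter and Prisoner of Azkaban',      'Bloomsbury (UK)', 2),
        ('Harry Potter the Order of the Phoenix',     'Bloomsbury (UK)', 4)
)
SELECT title, times,
       RANK()       OVER w AS rnk,        -- ties share a rank, gaps follow
       DENSE_RANK() OVER w AS dense_rnk,  -- ties share a rank, no gaps
       ROW_NUMBER() OVER w AS row_num     -- every row gets a distinct number
FROM counts
WINDOW w AS (PARTITION BY publisher ORDER BY times DESC)
ORDER BY times DESC, title;
```

Here the counts are 4, 4, 3, 3, 2, 2, 2. RANK gives 1, 1, 3, 3, 5, 5, 5, so a filter of <= 3 keeps four rows. DENSE_RANK gives 1, 1, 2, 2, 3, 3, 3 and keeps all seven. ROW_NUMBER gives 1 to 7 and keeps exactly three, but which of the two books with a count of 3 makes the cut is arbitrary unless a tie-breaker such as title is added to the ORDER BY of the window.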