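Get all information from duplicated rows

Time: 2019-06-20 20:47:25

Tags: sql

I need to get the duplicated rows (those that share the same rID and mID) from the Rating table (https://lagunita.stanford.edu/c4x/DB/SQL/asset/moviedata.html). I thought it would be easy with just the following code: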

SELECT 
     rating.rid, rating.mid
FROM 
     rating 
GROUP BY
     rating.rid, rating.mid
HAVING
     COUNT(*) > 1

It gave me this result:

rID mID
201 101
203 108

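Not bad, but I need ALL of the duplicated rows behind this result, because although they share the same rID and mID, their stars and ratingDate values differ (I need those for the next part of the exercise). I almost got there with the following code: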

SELECT first.rid, first.mid, first.stars
FROM rating AS first
INNER JOIN (
    SELECT
        rating.rid, rating.mid, stars, COUNT(*)
    FROM
        rating
    GROUP BY
        rating.rid, rating.mid
    HAVING
        COUNT(*) > 1
) AS second
ON first.rid = second.rid

Output:

rID  mID stars
201  101   2
201  101   4
203  103   2
203  108   4
203  108   2

But as you can see, one row in the output does not belong there:

rID  mID stars
203  103   2

My question is how to get this result instead:

rID  mID stars
201  101   2
201  101   4
203  108   4
203  108   2

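2 answers:

Answer 0: (score: 0)

A window function can help here. Count the rows in each (rId, mId) partition, then keep only the rows whose partition holds more than one: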

SELECT rId, mId, stars
FROM (SELECT rId, mId, stars
           , count(*) OVER (PARTITION BY rId, mId) AS c
      FROM rating) AS counted
WHERE c > 1
ORDER BY rId, mId
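
A minimal sketch for trying the window-function approach locally. It assumes Python's built-in sqlite3 module and a SQLite build of 3.25 or newer (the first release with window functions). The sample data is an assumption: the five rows shown in the question plus one extra non-duplicated row, with the ratingDate column left out. The stars column is added to ORDER BY only to make the row order deterministic.

```python
import sqlite3

# In-memory database holding a small, assumed subset of the Rating table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rating (rID INTEGER, mID INTEGER, stars INTEGER)")
sample_rows = [
    (201, 101, 2),
    (201, 101, 4),
    (203, 103, 2),
    (203, 108, 4),
    (203, 108, 2),
    (204, 101, 3),  # extra non-duplicated row, added for illustration
]
conn.executemany("INSERT INTO rating VALUES (?, ?, ?)", sample_rows)

# COUNT(*) OVER (PARTITION BY ...) attaches the group size to every row,
# so the outer query can filter on it without collapsing the rows.
query = """
SELECT rID, mID, stars
FROM (SELECT rID, mID, stars,
             COUNT(*) OVER (PARTITION BY rID, mID) AS c
      FROM rating) AS counted
WHERE c > 1
ORDER BY rID, mID, stars
"""
window_result = conn.execute(query).fetchall()
for row in window_result:
    print(row)
```

Unlike GROUP BY, the window function keeps every input row, which is why the stars values survive into the result.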

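Answer 1: (score: 0)

The ON clause of your query needs one more condition, because you define duplicates as rows where both rid and mid are equal. Joining on rid alone matches every rating by reviewer 203, including the one for movie 103: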

SELECT first.rid, first.mid, first.stars
FROM rating AS first INNER JOIN (
  SELECT rid, mid
  FROM rating  
  GROUP BY rid, mid
  HAVING COUNT(*) > 1
) AS second 
ON first.rid = second.rid AND first.mid = second.mid

I also simplified the subquery by removing the unnecessary columns from its SELECT list.
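
To see the effect of the extra join condition, here is a minimal sketch that runs the join both ways. It assumes Python's built-in sqlite3 module and a small sample made of the five rows from the question plus one extra non-duplicated row. The helper duplicated_rows and the aliases r and dup (standing in for first and second) are illustrative names, not part of the original query.

```python
import sqlite3

# In-memory database holding a small, assumed subset of the Rating table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rating (rid INTEGER, mid INTEGER, stars INTEGER)")
conn.executemany(
    "INSERT INTO rating VALUES (?, ?, ?)",
    [
        (201, 101, 2),
        (201, 101, 4),
        (203, 103, 2),
        (203, 108, 4),
        (203, 108, 2),
        (204, 101, 3),  # extra non-duplicated row, added for illustration
    ],
)

# Subquery listing each (rid, mid) pair that occurs more than once.
DUPLICATE_KEYS = """
    SELECT rid, mid
    FROM rating
    GROUP BY rid, mid
    HAVING COUNT(*) > 1
"""


def duplicated_rows(on_clause):
    """Join rating against the duplicate keys using the given ON clause."""
    sql = f"""
        SELECT r.rid, r.mid, r.stars
        FROM rating AS r
        INNER JOIN ({DUPLICATE_KEYS}) AS dup
        ON {on_clause}
        ORDER BY r.rid, r.mid, r.stars
    """
    return conn.execute(sql).fetchall()


# Joining on rid alone also pulls in (203, 103, 2).
rid_only = duplicated_rows("r.rid = dup.rid")
# Joining on both key columns keeps only the true duplicates.
rid_and_mid = duplicated_rows("r.rid = dup.rid AND r.mid = dup.mid")

print("rid only:   ", rid_only)
print("rid and mid:", rid_and_mid)
```

With this sample, the rid-only join returns five rows and the two-column join returns four, matching the outputs shown in the question.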