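I need to get the duplicates (rows that have the same rID and mID) from the table "Rating" (https://lagunita.stanford.edu/c4x/DB/SQL/asset/moviedata.html). I thought it would be easy using just the following code: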
SELECT
rating.rid, rating.mid
FROM
rating
GROUP BY
rating.rid, rating.mid
HAVING
COUNT(*) > 1
It gave me this result:
rID mID
201 101
203 108
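Not bad, but I need ALL of the duplicate rows in the result, because although they share the same rID and mID, their stars and ratingdate columns differ (I need them for the next part of the exercise). I almost got there with the following code: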
SELECT first.rid, first.mid, first.stars
FROM rating AS first
INNER JOIN
(SELECT
rating.rid, rating.mid, stars, COUNT(*)
FROM
rating
GROUP BY
rating.rid, rating.mid
HAVING
COUNT(*) > 1) AS second
ON first.rid = second.rid
Output:
rID mID stars
201 101 2
201 101 4
203 103 2
203 108 4
203 108 2
But as you can see, I got one unwanted row:
rID mID stars
203 103 2
My question is how to get this result:
rID mID stars
201 101 2
201 101 4
203 108 4
203 108 2
Answer 0: (score: 0)
Window functions can help you here:
SELECT rId, mId, stars
FROM (SELECT rId, mId, stars
, count(*) OVER (PARTITION BY rId, mId) AS c
FROM rating) AS counted
WHERE c > 1
ORDER BY rId, mId
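To try this without the full course database, here is a minimal sketch that runs the same window query through Python's built-in sqlite3 module. The table contents are an assumption: only the five (rid, mid, stars) rows visible in the question's output, with ratingdate left out. The helper name `duplicates_via_window` and the subquery alias `counted` are mine, and I added stars to the ORDER BY only to make the row order deterministic. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Assumed stand-in for the Rating table: just the (rid, mid, stars) rows
# that appear in the question's output; ratingdate is omitted.
SAMPLE_ROWS = [
    (201, 101, 2),
    (201, 101, 4),
    (203, 103, 2),
    (203, 108, 4),
    (203, 108, 2),
]

# Count the rows in each (rid, mid) group without collapsing the group,
# then keep only the rows whose group has more than one member.
WINDOW_QUERY = """
SELECT rid, mid, stars
FROM (SELECT rid, mid, stars,
             COUNT(*) OVER (PARTITION BY rid, mid) AS c
      FROM rating) AS counted
WHERE c > 1
ORDER BY rid, mid, stars
"""


def duplicates_via_window(rows):
    """Load rows into an in-memory table and return every duplicated row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE rating (rid INTEGER, mid INTEGER, stars INTEGER)"
        )
        conn.executemany("INSERT INTO rating VALUES (?, ?, ?)", rows)
        return conn.execute(WINDOW_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in duplicates_via_window(SAMPLE_ROWS):
        print(row)
```

Unlike GROUP BY, the window count keeps every row of a group, so each duplicate comes back with its own stars value.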
Answer 1: (score: 0)
Your query's ON clause needs one more condition, because you define duplicates as rows where both rid and mid are equal:
SELECT first.rid, first.mid, first.stars
FROM rating AS first INNER JOIN (
SELECT rid, mid
FROM rating
GROUP BY rid, mid
HAVING COUNT(*) > 1
) AS second
ON first.rid = second.rid AND first.mid = second.mid
I also simplified the subquery by removing the unnecessary columns from its SELECT list.
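If you want to see the difference side by side, here is a rough sketch using Python's sqlite3 module. The sample rows are an assumption (only the rows shown in the question's output, without ratingdate), the helper `run_join` is hypothetical, and the aliases `r` and `dup` stand in for `first` and `second`. I also added an ORDER BY so the row order is predictable.

```python
import sqlite3

# Assumed stand-in for the Rating table: just the (rid, mid, stars) rows
# that appear in the question's output; ratingdate is omitted.
SAMPLE_ROWS = [
    (201, 101, 2),
    (201, 101, 4),
    (203, 103, 2),
    (203, 108, 4),
    (203, 108, 2),
]

# The join from the answer, with the ON condition left as a placeholder
# so both variants can be run against the same data.
JOIN_TEMPLATE = """
SELECT r.rid, r.mid, r.stars
FROM rating AS r
INNER JOIN (
    SELECT rid, mid
    FROM rating
    GROUP BY rid, mid
    HAVING COUNT(*) > 1
) AS dup
ON {on_clause}
ORDER BY r.rid, r.mid, r.stars
"""


def run_join(on_clause, rows=SAMPLE_ROWS):
    """Run the join with the given ON condition on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE rating (rid INTEGER, mid INTEGER, stars INTEGER)"
        )
        conn.executemany("INSERT INTO rating VALUES (?, ?, ?)", rows)
        return conn.execute(JOIN_TEMPLATE.format(on_clause=on_clause)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Matching on rid alone pulls in every rating by reviewer 203.
    print(run_join("r.rid = dup.rid"))
    # Matching on both columns keeps only the duplicated (rid, mid) pairs.
    print(run_join("r.rid = dup.rid AND r.mid = dup.mid"))
```

With the rid-only condition, the row 203 103 2 shows up because reviewer 203 has a duplicate for a different movie (108); adding the mid condition drops it.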