I have this SQL query, which finds the rows in the table that are duplicated by movie_name:
SQL:
SELECT movies.movie_name, movies.year FROM movies
INNER JOIN (SELECT movie_name FROM movies
GROUP BY movie_name HAVING count(movie_id) > 1) dup ON movies.movie_name = dup.movie_name
-- I also want to match on the same year, not just movie_name, i.e. movies.year = dup.year
Is this possible?
Answer 0: (score: 1)
SELECT movies.movie_name, movies.year FROM movies
INNER JOIN (SELECT movie_name, year FROM movies
GROUP BY movie_name,year HAVING count(movie_id) > 1) dup ON movies.movie_name = dup.movie_name
and movies.year = dup.year
seems like a reasonable start...
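To see what grouping on both columns changes, here is a minimal sketch that runs the query above against an in-memory SQLite database from Python. The sample rows are made up, and SQLite is only a stand-in for whatever database the question actually uses.

```python
import sqlite3

# Hypothetical sample data: "Dune" appears three times, but only two rows share a year.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, movie_name TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO movies (movie_id, movie_name, year) VALUES (?, ?, ?)",
    [
        (1, "Dune", 1984),
        (2, "Dune", 2021),
        (3, "Dune", 2021),
        (4, "Alien", 1979),
    ],
)

# Same query as in the answer: a row counts as a duplicate only if name AND year match.
dup_rows = conn.execute(
    """
    SELECT movies.movie_name, movies.year FROM movies
    INNER JOIN (SELECT movie_name, year FROM movies
                GROUP BY movie_name, year HAVING count(movie_id) > 1) dup
      ON movies.movie_name = dup.movie_name AND movies.year = dup.year
    ORDER BY movies.movie_id
    """
).fetchall()

print(dup_rows)  # [('Dune', 2021), ('Dune', 2021)]
```

The 1984 row is left out because no other row shares both its name and its year.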
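As for deleting one: I think what you really mean is keeping one. Don't forget that a row can have more than one duplicate.
Let's say we keep the first and get rid of the rest, where "first" means the lowest movie_id.
So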
Select Min(Movie_id), Movie_Name, Year From Movies Group By Movie_Name, Year
gives you all the rows to keep, one per (Movie_Name, Year) pair.
Select m.Movie_id, m.Movie_Name, m.Year From Movies m
Left Join
(Select Min(Movie_id) As Movie_id, Movie_Name, Year From Movies Group By Movie_Name, Year) keep
On keep.Movie_id = m.Movie_id Where keep.Movie_id is null
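The above returns every record in Movies that is not among the ones we want to keep.
So that gives us everything you want to get rid of. For Cthulhu's sake, don't trust me or yourself! Take a backup before deleting!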
Delete m From Movies m
Left Join
(Select Min(Movie_id) As Movie_id, Movie_Name, Year From Movies Group By Movie_Name, Year) keep
On keep.Movie_id = m.Movie_id Where keep.Movie_id is null
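So now we have taken our proven query (you did prove it, didn't you!) and, instead of selecting the offending records, we are deleting them.
Don't forget the backup!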
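The "prove it, then delete" routine can be sketched as below, again with Python's sqlite3 and made-up rows. Two assumptions to flag: SQLite does not accept the `Delete m From ... Join` form, so the sketch feeds the same anti-join into `WHERE movie_id IN (...)` instead, and the transaction is only a safety net that lets a surprising delete be rolled back. It is not a substitute for the backup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, movie_name TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO movies (movie_id, movie_name, year) VALUES (?, ?, ?)",
    [
        (1, "Dune", 1984),
        (2, "Dune", 2021),
        (3, "Dune", 2021),
        (4, "Dune", 2021),
        (5, "Alien", 1979),
    ],
)
conn.commit()

# Anti-join from the answer: rows with no match in "keep" are the ones to discard.
DISCARD_SQL = """
    SELECT m.movie_id FROM movies m
    LEFT JOIN (SELECT MIN(movie_id) AS movie_id, movie_name, year
               FROM movies GROUP BY movie_name, year) keep
      ON keep.movie_id = m.movie_id
    WHERE keep.movie_id IS NULL
"""

# Step 1: look at what would be deleted before touching anything.
to_discard = [row[0] for row in conn.execute(DISCARD_SQL + " ORDER BY m.movie_id")]
print("would delete:", to_discard)  # would delete: [3, 4]

# Step 2: delete inside a transaction; "with conn" commits on success
# and rolls back if an exception is raised inside the block.
with conn:
    cur = conn.execute(f"DELETE FROM movies WHERE movie_id IN ({DISCARD_SQL})")
    deleted = cur.rowcount
    if deleted != len(to_discard):
        raise RuntimeError("deleted row count differs from the preview, rolling back")

remaining = conn.execute(
    "SELECT movie_id, movie_name, year FROM movies ORDER BY movie_id"
).fetchall()
print(remaining)  # [(1, 'Dune', 1984), (2, 'Dune', 2021), (5, 'Alien', 1979)]
```

Note that "Dune" (2021) had two extra copies and both are removed, which is the "more than one duplicate" case mentioned above.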
Answer 1: (score: 0)
I really like Tony's idea of using MIN. However, the whole query can be simpler:
DELETE FROM movies WHERE movie_id NOT IN
(SELECT MIN(movie_id) FROM movies GROUP BY movie_name, year);
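A quick way to try the NOT IN version is the sketch below, which runs it on made-up rows in an in-memory SQLite database via Python. It works in SQLite; some databases, MySQL among them, reject a DELETE whose subquery reads from the table being modified, so treat this as an illustration and check it on your own engine first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, movie_name TEXT, year INTEGER)"
)
# Hypothetical rows: three copies of ("Heat", 1995), plus two unique rows.
conn.executemany(
    "INSERT INTO movies (movie_id, movie_name, year) VALUES (?, ?, ?)",
    [
        (1, "Heat", 1995),
        (2, "Heat", 1995),
        (3, "Heat", 1995),
        (4, "Heat", 1972),
        (5, "Solaris", 1972),
    ],
)

# Keep the lowest movie_id of every (movie_name, year) group, delete the rest.
cur = conn.execute(
    "DELETE FROM movies WHERE movie_id NOT IN "
    "(SELECT MIN(movie_id) FROM movies GROUP BY movie_name, year)"
)
deleted_count = cur.rowcount

survivors = [
    row[0] for row in conn.execute("SELECT movie_id FROM movies ORDER BY movie_id")
]
print(deleted_count, survivors)  # 2 [1, 4, 5]
```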