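I have a database table with three columns: id, user_id, book_id. There are some duplicates in this table. Each user_id should have only one book_id record, but in some cases a user_id has several. There are already a few million records, and I would like to know how to delete the duplicate records.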
Answer 0: (score: 1)
Try the following.
SQL SERVER
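-- Number the rows in each (user_id, book_id) group; rn = 1 is the smallest id.
WITH ORDERED AS
(
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY [user_id], [book_id] ORDER BY id ASC) AS rn
    FROM tableName
)
-- Delete every row except the first one in its group.
DELETE FROM tableName
WHERE id IN (SELECT id FROM ORDERED WHERE rn != 1)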
MYSQL
DELETE FROM tableName
WHERE id NOT IN (
    SELECT MIN(id) FROM tableName
    GROUP BY user_id, book_id
)
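Edit based on comments: in MySQL you cannot modify a table that is also used in the SELECT part of the same statement (error 1093), so the query above fails.
The following works around the problem by wrapping the subquery in a derived table.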
DELETE FROM tableName
WHERE id NOT IN (
    SELECT temp.temp_id FROM (
        SELECT MIN(id) AS temp_id FROM tableName
        GROUP BY user_id, book_id
    ) AS temp
)
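To try the keep-the-smallest-id approach on throwaway data first, here is a minimal Python sketch using the standard-library sqlite3 module. SQLite stands in for MySQL purely for illustration (an assumption; the answer targets MySQL), the sample rows are made up, and the function names and the alias `keep_ids` are hypothetical. `tableName` and the column names come from the question.

```python
import sqlite3


def dedupe_keep_min_id(conn):
    """Delete duplicate (user_id, book_id) rows, keeping the smallest id."""
    conn.execute(
        """
        DELETE FROM tableName
        WHERE id NOT IN (
            SELECT keep_ids.temp_id FROM (
                SELECT MIN(id) AS temp_id
                FROM tableName
                GROUP BY user_id, book_id
            ) AS keep_ids
        )
        """
    )
    conn.commit()


def demo_keep_min():
    # In-memory database with made-up sample rows (assumption, not real data).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableName ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, book_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tableName (id, user_id, book_id) VALUES (?, ?, ?)",
        [(1, 10, 100), (2, 10, 100), (3, 10, 200), (4, 20, 100), (5, 20, 100)],
    )
    dedupe_keep_min_id(conn)
    rows = conn.execute(
        "SELECT id, user_id, book_id FROM tableName ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo_keep_min())
```

With this sample data, rows 2 and 5 are the repeats of an existing (user_id, book_id) pair, so rows 1, 3 and 4 remain.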
The derived-table DELETE keeps only one row (the one with the smallest id) for each (user_id, book_id) combination.
Answer 1: (score: 0)
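If you execute the following statement, it will delete all duplicate records for each user_ID, leaving only the row with the greatest ID for each user_ID.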
DELETE a
FROM tableName a
LEFT JOIN
(
SELECT user_ID, MAX(ID) max_ID
FROM tableName
GROUP BY user_ID
) b ON a.user_ID = b.user_ID AND
a.ID = b.max_ID
WHERE b.max_ID IS NULL
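Because this statement groups by user_ID alone, it leaves exactly one row per user, whichever book that row refers to. The Python sketch below illustrates that effect on made-up rows using the standard-library sqlite3 module. SQLite does not accept the multi-table `DELETE a FROM ... JOIN` form, so the sketch swaps in an equivalent `NOT IN` subquery; that substitution, the function names and the sample data are assumptions for illustration.

```python
import sqlite3


def keep_max_id_per_user(conn):
    """Delete every row except the one with the greatest id per user_id."""
    conn.execute(
        """
        DELETE FROM tableName
        WHERE id NOT IN (
            SELECT MAX(id) FROM tableName GROUP BY user_id
        )
        """
    )
    conn.commit()


def demo_keep_max():
    # In-memory database with made-up sample rows (assumption, not real data).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tableName ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, book_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tableName (id, user_id, book_id) VALUES (?, ?, ?)",
        [(1, 10, 100), (2, 10, 100), (3, 10, 200), (4, 20, 100)],
    )
    keep_max_id_per_user(conn)
    rows = conn.execute(
        "SELECT id, user_id, book_id FROM tableName ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo_keep_max())
```

In this sample, user 10 ends up with only row 3 (book 200), so the book 100 rows disappear entirely. That matches the rule of one book per user from the question, but it differs from deduplicating (user_id, book_id) pairs.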
Answer 2: (score: 0)
Hopefully this query lets you delete the duplicates:
DELETE bl1 FROM book_log bl1
JOIN book_log bl2
ON (
bl1.id > bl2.id AND
bl1.user_id = bl2.user_id AND
bl1.book_id = bl2.book_id
);
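Before running a destructive DELETE on a table with millions of rows, it can help to count what the join condition would match. The Python sketch below runs the same self-join as a SELECT against an in-memory SQLite database (standard-library sqlite3). SQLite is only a stand-in here, and the function names and sample rows are hypothetical; `book_log` and its columns are taken from the query above.

```python
import sqlite3


def count_rows_to_delete(conn):
    """Count rows that have an earlier row with the same (user_id, book_id)."""
    (count,) = conn.execute(
        """
        SELECT COUNT(DISTINCT bl1.id)
        FROM book_log bl1
        JOIN book_log bl2
          ON bl1.id > bl2.id
         AND bl1.user_id = bl2.user_id
         AND bl1.book_id = bl2.book_id
        """
    ).fetchone()
    return count


def demo_preview():
    # In-memory database with made-up sample rows (assumption, not real data).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE book_log ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, book_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO book_log (id, user_id, book_id) VALUES (?, ?, ?)",
        [(1, 10, 100), (2, 10, 100), (3, 10, 100), (4, 20, 100), (5, 10, 200)],
    )
    total = conn.execute("SELECT COUNT(*) FROM book_log").fetchone()[0]
    to_delete = count_rows_to_delete(conn)
    conn.close()
    return total, to_delete


if __name__ == "__main__":
    print(demo_preview())
```

`COUNT(DISTINCT bl1.id)` matters because a row with several earlier duplicates (row 3 in the sample) joins to each of them and would otherwise be counted more than once.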