Long-time lurker, first-time poster (er?)
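I need to delete certain duplicate entries from a database based on attributes spread across several tables. Below is how I have done it so far, but I'm sure there is a better way (I am by no means an SQL expert!). Any pointers would be great. (It failed when I ran it a second time.)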
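I first take all rows from the table, order them by date, and pick the most recent entry (dup1). I then match that list against another table based on a value (dup2). Finally, I build a list of rows from the IDs that appear in both tables (dup3). I then want to delete the rows from the main table whose ID is in that third temporary table.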
First, I created a temporary table:
create temporary table dup1
as
select * from
(SELECT hex(media_id) as asset_id, folder_id, name, ingest_date
FROM media
order by ingest_date DESC) as dup
group by name having count(name)>1 and count(hex(folder_id))>1;
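A note on the statement above: an `order by` inside a derived table does not guarantee which row `group by name` ends up keeping, and with `ONLY_FULL_GROUP_BY` enabled (the default since MySQL 5.7.5) selecting the non-aggregated columns this way is rejected. A hedged sketch of one alternative that picks the latest row per name explicitly. It uses only the table and column names from the post, and it assumes `ingest_date` is what defines "most recent". If two rows with the same name share the same latest date, both are returned.

```sql
-- Sketch only: alternative way to build dup1, not the statement from the post.
-- The inner query finds names that occur more than once and their newest date;
-- the outer join pulls the full row(s) matching that newest date.
CREATE TEMPORARY TABLE dup1 AS
SELECT HEX(m.media_id) AS asset_id, m.folder_id, m.name, m.ingest_date
FROM media AS m
JOIN (
    SELECT name, MAX(ingest_date) AS latest
    FROM media
    GROUP BY name
    HAVING COUNT(name) > 1 AND COUNT(folder_id) > 1
) AS d ON d.name = m.name AND d.latest = m.ingest_date;
```

`COUNT(folder_id)` counts the same rows as `count(hex(folder_id))` in the post, since `HEX` of a non-NULL value is never NULL.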
Then I created a second temporary table:
create temporary table dup2
as
SELECT hex(asset_id) as asset_id, value FROM datavalues where name_id = 103 group by value having count(value)>1;
Then I created a final temporary table that combines the previous two:
create temporary table dup3
as
select dup1.asset_id from dup1
join dup2 on dup2.asset_id = dup1.asset_id;
Then I delete from the media table every asset that exists in dup3:
DELETE FROM media where hex(media_id) in (SELECT * from dup3);
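One plausible reason for the failure on a second run, assuming it happens in the same connection: MySQL temporary tables live until the session ends, so a second `create temporary table dup1` runs into a table that already exists. The post does not show the error message, so this is a guess. A minimal sketch of a re-runnable variant that drops leftovers first and deletes with a join instead of `IN (SELECT *)`; it uses only the names from the post.

```sql
-- Clear leftovers from an earlier run in the same session.
-- IF EXISTS keeps this from failing on the very first run.
DROP TEMPORARY TABLE IF EXISTS dup1, dup2, dup3;

-- ... recreate dup1, dup2 and dup3 here, as in the statements above ...

-- Multi-table DELETE: only rows of media (the alias named after DELETE)
-- are removed; dup3 is used purely to select them.
DELETE m
FROM media AS m
JOIN dup3 AS d ON d.asset_id = HEX(m.media_id);
```

Running the join as a `SELECT m.* ...` first is a cheap way to check which rows would be removed before deleting anything.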