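We have a large table with a little over 1 million rows. Can someone help with how to find the duplicate data in the table and possibly move it to an ARCHIVE table?
Table name: CustomerData
Number of fields: 10
The latest record should be kept (it is the one whose END_DATE is NULL).
Regards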
Answer 0: (score: 3)
Do you only need to move the rows where END_DATE is not NULL?
In a single transaction:
INSERT INTO archive (column1, column2, ... column10)
SELECT column1, column2, ..., column10
FROM CustomerData
WHERE END_DATE IS NOT NULL;
DELETE FROM CustomerData
WHERE END_DATE IS NOT NULL;
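The two statements above only behave as one unit if they are wrapped in an explicit transaction. The following is a minimal sketch of that, not a definitive implementation. The three-column tables and the sample rows are hypothetical stand-ins for the real ten-column tables, and `BEGIN TRANSACTION` / `COMMIT` is the form accepted by SQLite, PostgreSQL and SQL Server; other databases may need different transaction syntax (for example `START TRANSACTION` in MySQL).

```sql
-- Hypothetical three-column stand-ins for the real ten-column tables.
CREATE TABLE CustomerData (cust_id INTEGER, cust_name VARCHAR(50), END_DATE DATE);
CREATE TABLE archive (cust_id INTEGER, cust_name VARCHAR(50), END_DATE DATE);

-- Sample data: customer 1 has one closed record and one current record.
INSERT INTO CustomerData VALUES (1, 'Ann', '2020-01-31');
INSERT INTO CustomerData VALUES (1, 'Ann', NULL);
INSERT INTO CustomerData VALUES (2, 'Bob', NULL);

BEGIN TRANSACTION;

-- Copy every closed record (END_DATE set) into the archive.
INSERT INTO archive (cust_id, cust_name, END_DATE)
SELECT cust_id, cust_name, END_DATE
FROM CustomerData
WHERE END_DATE IS NOT NULL;

-- Remove the same rows from the live table.
DELETE FROM CustomerData
WHERE END_DATE IS NOT NULL;

COMMIT;
```

If either statement fails, issuing `ROLLBACK` instead of `COMMIT` leaves both tables as they were, so a row is never deleted without having been archived.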
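Answer 1: (score: 0)
Assume the CustomerData table has the structure: CustomerData(cust_id, Cust_name, Address_ID, start_time, End_Date, ..., 7 other columns);
Assume that two customers sharing the same Address_ID is what makes a row a duplicate.
To insert into the archive table: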
INSERT INTO archive (column1, column2, ... column10)
SELECT cust_id, start_time, ..., End_Date
FROM CustomerData
WHERE End_Date IS NOT NULL
AND Address_ID IN (
    SELECT Address_ID FROM
    (
        -- Address_ID values that appear on more than one row
        SELECT Address_ID, COUNT(Address_ID) AS cnt
        FROM CustomerData
        GROUP BY Address_ID
        HAVING COUNT(Address_ID) > 1
    ) dup
);
To delete from the CustomerData table:
DELETE FROM CustomerData
WHERE End_Date IS NOT NULL
AND Address_ID IN (
    SELECT Address_ID FROM
    (
        -- Address_ID values that appear on more than one row
        SELECT Address_ID, COUNT(Address_ID) AS cnt
        FROM CustomerData
        GROUP BY Address_ID
        HAVING COUNT(Address_ID) > 1
    ) dup
);
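The inner subquery picks out the Address_ID values that occur more than once, much as you would for the DeptID column in the employees table that ships with the Oracle database.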
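To see what the inner subquery returns on its own, it can be run against a small sample. This is only a sketch: the cut-down table definition and the three sample rows are hypothetical, and `COUNT(*)` is used here in place of `COUNT(Address_ID)`, which gives the same result as long as Address_ID is never NULL.

```sql
-- Hypothetical cut-down table with only the columns this query needs.
CREATE TABLE CustomerData (cust_id INTEGER, Address_ID INTEGER, End_Date DATE);

-- Customers 1 and 2 share Address_ID 100; customer 3 is alone at 200.
INSERT INTO CustomerData VALUES (1, 100, '2019-06-30');
INSERT INTO CustomerData VALUES (2, 100, NULL);
INSERT INTO CustomerData VALUES (3, 200, NULL);

-- Address_ID values that occur on more than one row.
SELECT Address_ID, COUNT(*) AS row_count
FROM CustomerData
GROUP BY Address_ID
HAVING COUNT(*) > 1;
```

With this sample the query returns the single row (100, 2). Running it before the INSERT and DELETE is a cheap way to check how many rows will be affected.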