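MySQL - Add duplicate rows to an archive table, then delete the duplicate rows

Date: 2013-10-18 19:08:35

Tags: mysql

I have been researching the proper way to find duplicate rows based on specific fields. I think I need a bit more help: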

 SELECT * 
    FROM enrollees
    INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
    ON enrollees.first_name = b.first_name
        AND enrollees.last_name = b.last_name 
        AND enrollees.address1 = b.address1
        AND enrollees.city = b.city
        AND enrollees.state = b.state
        AND enrollees.zip = b.zip 
        AND count > 1 
        AND enrollees.program_instance_id = b.program_instance_id 
        AND enrollees.id != MinId;

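The goal is to take the duplicates and put them into an archive table (enrollees_duplicates), then delete them from the live table (enrollees). I tried writing a query that finds and inserts the duplicate rows, but it gives me the following error:

"Column count doesn't match value count at row 1"

The query I am trying to use is: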

INSERT INTO enrollees_duplicates (SELECT * 
    FROM enrollees
    INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
    ON enrollees.first_name = b.first_name
        AND enrollees.last_name = b.last_name 
        AND enrollees.address1 = b.address1
        AND enrollees.city = b.city
        AND enrollees.state = b.state
        AND enrollees.zip = b.zip 
        AND count > 1 
        AND enrollees.program_instance_id = b.program_instance_id 
        AND enrollees.id != MinId);

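I assume this is because I am not retrieving all of the columns in the INNER JOIN select? If so, and I change it to SELECT * (with MinId and count added), wouldn't it still throw the same error, since those two extra columns do not exist in the new table?

Is there a way to do all of this in SQL, without having to select the duplicates, store them in a PHP array, run another SQL query per row to insert it into the duplicates table, and then run yet another SQL query to delete the duplicate rows?

My intention is to use two queries: one to insert all duplicate rows into the archive table, and another to delete the duplicate rows. If this could somehow be done as a single query that finds the duplicates, inserts them into the archive table, and then deletes them, all in one run, that would be even better.

I am new to this area, so any help or guidance would be greatly appreciated.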

2 Answers:

Answer 0: (score: 0)

"Column count doesn't match value count at row 1"

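This error means that the number of columns returned by the SELECT differs from the number of columns in enrollees_duplicates. One possible cause is that the tables enrollees_duplicates and enrollees have different structures.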
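One way to check whether the two tables really differ is to compare their column counts. This is a minimal sketch that assumes both tables live in the currently selected database:

```sql
-- Count the columns of each table in the current database.
-- If the counts differ, INSERT ... SELECT * between the two tables
-- cannot work without an explicit column list.
SELECT TABLE_NAME, COUNT(*) AS column_count
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('enrollees', 'enrollees_duplicates')
GROUP BY TABLE_NAME;
```

Even when the counts match, what matters is the number of columns the SELECT actually returns, so a SELECT * over a join can still produce more columns than the target table has.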

Might it be better to use a DELETE trigger? (http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html).
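As a rough sketch of the trigger idea, a BEFORE DELETE trigger could copy each row into the archive table at the moment it is deleted. The trigger name and the column list are assumptions: only the columns named in the question are listed, and the real tables may have more. Note also that such a trigger would archive every row deleted from enrollees, not only duplicates.

```sql
-- Hypothetical trigger name; column list assumed from the question.
-- Fires once per deleted row and copies the old values to the archive.
CREATE TRIGGER enrollees_archive_before_delete
BEFORE DELETE ON enrollees
FOR EACH ROW
INSERT INTO enrollees_duplicates
    (id, first_name, last_name, address1, city, state, zip, program_instance_id)
VALUES
    (OLD.id, OLD.first_name, OLD.last_name, OLD.address1,
     OLD.city, OLD.state, OLD.zip, OLD.program_instance_id);
```

With a trigger like this in place, a single DELETE of the duplicate rows would handle both the archiving and the removal.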

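Answer 1: (score: 0)

The solution to my problem was this: when my outer select was just '*', it added the two extra columns from the subquery (MinId, count) to the result, which made the column counts differ. Selecting only the columns of the 'enrollees' table (enrollees.*), rather than also pulling in the subquery's columns, corrects the column mismatch.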

INSERT INTO enrollees_duplicates (SELECT enrollees.* 
    FROM enrollees
    INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
    ON enrollees.first_name = b.first_name
        AND enrollees.last_name = b.last_name 
        AND enrollees.address1 = b.address1
        AND enrollees.city = b.city
        AND enrollees.state = b.state
        AND enrollees.zip = b.zip 
        AND count > 1 
        AND enrollees.program_instance_id = b.program_instance_id 
        AND enrollees.id != MinId);
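For the second step, a multi-table DELETE can reuse the same derived table to remove the rows that were just archived. This is a sketch under assumptions, not a tested solution: it assumes enrollees uses a transactional engine such as InnoDB (otherwise START TRANSACTION and COMMIT have no effect), and it moves the count > 1 check into a HAVING clause, which selects the same groups.

```sql
START TRANSACTION;

-- Step 1: archive every duplicate except the row with the lowest id.
INSERT INTO enrollees_duplicates
SELECT e.*
FROM enrollees AS e
INNER JOIN (
    SELECT first_name, last_name, address1, city, state, zip,
           program_instance_id, MIN(id) AS MinId
    FROM enrollees
    GROUP BY first_name, last_name, address1, city, state, zip,
             program_instance_id
    HAVING COUNT(*) > 1
) AS b
    ON  e.first_name = b.first_name
    AND e.last_name = b.last_name
    AND e.address1 = b.address1
    AND e.city = b.city
    AND e.state = b.state
    AND e.zip = b.zip
    AND e.program_instance_id = b.program_instance_id
WHERE e.id <> b.MinId;

-- Step 2: delete the same rows from the live table.
-- The derived table uses GROUP BY, so it is materialized before the
-- delete starts reading from enrollees.
DELETE e
FROM enrollees AS e
INNER JOIN (
    SELECT first_name, last_name, address1, city, state, zip,
           program_instance_id, MIN(id) AS MinId
    FROM enrollees
    GROUP BY first_name, last_name, address1, city, state, zip,
             program_instance_id
    HAVING COUNT(*) > 1
) AS b
    ON  e.first_name = b.first_name
    AND e.last_name = b.last_name
    AND e.address1 = b.address1
    AND e.city = b.city
    AND e.state = b.state
    AND e.zip = b.zip
    AND e.program_instance_id = b.program_instance_id
WHERE e.id <> b.MinId;

COMMIT;
```

One caveat: GROUP BY treats NULL values as equal, but the = comparisons in the join do not, so duplicates with a NULL in any compared column would be neither archived nor deleted. If that matters, MySQL's null-safe operator <=> could replace = in the join conditions.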