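I have been researching the proper way to find duplicate rows based on specific fields, and I think I need some more help. This is the query I have so far: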
SELECT *
FROM enrollees
INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
ON enrollees.first_name = b.first_name
AND enrollees.last_name = b.last_name
AND enrollees.address1 = b.address1
AND enrollees.city = b.city
AND enrollees.state = b.state
AND enrollees.zip = b.zip
AND count > 1
AND enrollees.program_instance_id = b.program_instance_id
AND enrollees.id != MinId;
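The goal is to take the duplicates and put them into an archive table (enrollees_duplicates), then remove them from the live table (enrollees). I tried writing a query that finds the duplicate rows and inserts them, but it gives me the following error:
"Column count doesn't match value count at row 1"
The query I tried is: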
INSERT INTO enrollees_duplicates (SELECT *
FROM enrollees
INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
ON enrollees.first_name = b.first_name
AND enrollees.last_name = b.last_name
AND enrollees.address1 = b.address1
AND enrollees.city = b.city
AND enrollees.state = b.state
AND enrollees.zip = b.zip
AND count > 1
AND enrollees.program_instance_id = b.program_instance_id
AND enrollees.id != MinId);
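I assume this is because I am not retrieving all of the columns in the INNER JOIN select? If that is the case, and I change it to SELECT * (with MinId and count added), wouldn't it still throw the same error, since those two extra columns do not exist in the new table?
Is there a way to do all of this in SQL, without selecting the duplicates, storing them in a PHP array, running another SQL query to pull each row and insert it into the duplicates table, and then running yet another SQL query to delete the duplicate rows?
My intent is to use two queries: one to insert all duplicate rows into the archive table, and another to delete the duplicate rows. If it could somehow be done as a single query that finds the duplicates, inserts them into the archive table, and then deletes them, all in one run, that would be even better.
I am new to this area, so any help or guidance would be much appreciated.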
Answer 0: (score: 0)
"Column count doesn't match value count at row 1"
The tables enrollees_duplicates and enrollees have different structures.
Might it be better to use an ON DELETE trigger? (http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html).
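A minimal sketch of what such a trigger might look like. The trigger name is hypothetical, and the column list contains only the columns named in the question. The real enrollees table very likely has more, and enrollees_duplicates is assumed here to have matching columns. Note that a trigger like this archives every row deleted from enrollees, not only duplicates.

```sql
-- Hypothetical trigger: copy each row into the archive table just before it is deleted.
-- Assumption: enrollees_duplicates has (at least) the columns listed below.
CREATE TRIGGER enrollees_archive_before_delete
BEFORE DELETE ON enrollees
FOR EACH ROW
INSERT INTO enrollees_duplicates
    (id, first_name, last_name, address1, city, state, zip, program_instance_id)
VALUES
    (OLD.id, OLD.first_name, OLD.last_name, OLD.address1, OLD.city, OLD.state,
     OLD.zip, OLD.program_instance_id);
```

With a trigger like this in place, only the DELETE statement would need to be run; the archive insert would happen as a side effect of each deleted row.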
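Answer 1: (score: 0)
The solution to my problem: when my outer select was just '*', the result also included the subquery's columns (the grouped fields plus MinId and count), so the column count no longer matched the archive table. Selecting only the columns of the 'enrollees' table, and none of the subquery's columns, fixes the column mismatch.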
INSERT INTO enrollees_duplicates (SELECT enrollees.*
FROM enrollees
INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
ON enrollees.first_name = b.first_name
AND enrollees.last_name = b.last_name
AND enrollees.address1 = b.address1
AND enrollees.city = b.city
AND enrollees.state = b.state
AND enrollees.zip = b.zip
AND count > 1
AND enrollees.program_instance_id = b.program_instance_id
AND enrollees.id != MinId);
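For the second half of the question (archiving and deleting in one run), one possible approach is to wrap the insert and a matching multi-table DELETE in a transaction. This is a sketch under several assumptions: the tables use a transactional engine such as InnoDB, enrollees_duplicates has the same columns in the same order as enrollees, and nothing else writes to enrollees while it runs. The alias e and the HAVING clause are stylistic changes, not part of the original query.

```sql
START TRANSACTION;

-- 1) Archive every row of a duplicate group except the one with the lowest id.
INSERT INTO enrollees_duplicates
SELECT e.*
FROM enrollees e
INNER JOIN (
    SELECT first_name, last_name, address1, city, state, zip,
           program_instance_id, MIN(id) AS MinId
    FROM enrollees
    GROUP BY first_name, last_name, address1, city, state, zip,
             program_instance_id
    HAVING COUNT(id) > 1          -- keep only groups that contain duplicates
) b
   ON e.first_name = b.first_name
  AND e.last_name = b.last_name
  AND e.address1 = b.address1
  AND e.city = b.city
  AND e.state = b.state
  AND e.zip = b.zip
  AND e.program_instance_id = b.program_instance_id
WHERE e.id != b.MinId;

-- 2) Delete the same rows from the live table, using the same join condition.
DELETE e
FROM enrollees e
INNER JOIN (
    SELECT first_name, last_name, address1, city, state, zip,
           program_instance_id, MIN(id) AS MinId
    FROM enrollees
    GROUP BY first_name, last_name, address1, city, state, zip,
             program_instance_id
    HAVING COUNT(id) > 1
) b
   ON e.first_name = b.first_name
  AND e.last_name = b.last_name
  AND e.address1 = b.address1
  AND e.city = b.city
  AND e.state = b.state
  AND e.zip = b.zip
  AND e.program_instance_id = b.program_instance_id
WHERE e.id != b.MinId;

COMMIT;
```

One caveat applies to this sketch and to the original query alike: GROUP BY puts NULL values in the same group, but `=` never matches NULL to NULL, so duplicate rows with a NULL in any of the compared fields will not be picked up by the join. If those fields are nullable, MySQL's NULL-safe operator `<=>` could be used in place of `=`.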