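I'm trying to clean up a database that contains duplicate records. I need to move the references to a single record and delete the other one.
I have two tables, promoters and venues, each with a reference to a table called cities. The problem is that some cities have the same name but different IDs, and each of those duplicates is referenced by venues and promoters.
With this query I can group all promoters and venues under a single city record: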
SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids, GROUP_CONCAT( DISTINCT v.id ) as venues_ids
FROM cities as c
LEFT JOIN promoters as p ON p.city_id = c.id
LEFT JOIN venues as v ON v.city_id = c.id
WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 )
GROUP BY c.name
Now I want to run an UPDATE query on promoters, setting city_id to the result of the query above.
Something like this:
UPDATE promoters AS pr SET pr.city_id = (
SELECT ID
FROM (
SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids
FROM cities as c
LEFT JOIN promoters as p ON p.city_id = c.id
WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 ) AND pr.id IN promoters_ids
GROUP BY c.name
) AS T1
)
How can I do this?
Thanks
Answer 0 (score: 3)
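If I understand correctly, you want to delete the duplicate cities in the end, so you first need to update every promoter that is linked to a city you are going to delete.
I think it makes sense to use the lowest ID of all cities that share the same name (the highest would work too, but I'd rather specify it explicitly than leave it to chance).
So, to get the right ID for a promoter, I need to select the lowest ID of all cities that have the same name as the city already linked to that promoter.
Fortunately, that requirement translates quite directly into a query: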
UPDATE promoters AS pr
SET pr.city_id = (
SELECT
-- Select the lowest ID ..
Min(c.id)
FROM
-- .. of all cities ..
cities c
-- .. that have the same name ..
INNER JOIN cities pc ON pc.name = c.name
WHERE
-- .. as the city already linked to the promoter being updated
pc.id = pr.city_id
GROUP BY
c.name)
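To sanity-check the statement above without touching real data, here is a minimal sketch that runs the same correlated subquery against a throwaway in-memory database. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table contents and the repoint_promoters helper are invented for illustration. SQLite does not accept an alias-qualified column in SET, so the target is written as plain city_id here.

```python
import sqlite3

def repoint_promoters(conn):
    """Point each promoter at the lowest-id city sharing its city's name."""
    conn.execute("""
        UPDATE promoters
        SET city_id = (
            SELECT MIN(c.id)
            FROM cities c
            INNER JOIN cities pc ON pc.name = c.name
            WHERE pc.id = promoters.city_id
            GROUP BY c.name
        )
    """)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE promoters (id INTEGER PRIMARY KEY, city_id INTEGER);
    -- Hypothetical sample data: 'Paris' exists twice, as ids 1 and 3.
    INSERT INTO cities VALUES (1, 'Paris'), (2, 'Berlin'), (3, 'Paris');
    INSERT INTO promoters VALUES (10, 1), (11, 3), (12, 2);
""")

repoint_promoters(conn)
rows = conn.execute("SELECT id, city_id FROM promoters ORDER BY id").fetchall()
print(rows)  # promoter 11 should now reference city 1 instead of city 3
```

The same statement, with venues in place of promoters, should cover the second referencing table from the question.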
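The trick is to join cities to itself on name, which makes it easy to get all cities that share a name. I think you were attempting the same thing with the IN clause, but that is more complicated than it needs to be.
I don't think you need group_concat at all, except to check that the inner query really returns the right cities, and even that is pointless because you are already grouping by name. Written like this, you can see that it cannot go wrong: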
SELECT
-- Select the lowest ID ..
MIN(c.id) AS id,
GROUP_CONCAT(c.name) AS names -- < already grouped by this, so why...
FROM
-- .. of all cities ..
cities c
-- .. that have the same name.
INNER JOIN cities pc ON pc.name = c.name
GROUP BY
c.name
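Once promoters and venues both reference the lowest ID for each name, the higher-ID duplicates are no longer referenced and can be deleted, which was the end goal in the question. The sketch below shows one way that last step might look, again using Python's built-in sqlite3 as a stand-in for MySQL, with invented sample rows and a hypothetical delete_duplicate_cities helper. Note that MySQL does not allow a DELETE to select from its own target table in a subquery, so there a multi-table DELETE with a self-join would probably be needed instead.

```python
import sqlite3

# Subquery listing the one city id to keep per name (the lowest).
KEEP_IDS = "SELECT MIN(id) FROM cities GROUP BY name"

def delete_duplicate_cities(conn):
    """Delete all but the lowest-id city per name; return the number deleted."""
    # Refuse to delete while any promoter or venue still references a duplicate.
    for table in ("promoters", "venues"):
        (dangling,) = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE city_id NOT IN ({KEEP_IDS})"
        ).fetchone()
        if dangling:
            raise RuntimeError(f"{dangling} rows in {table} still reference duplicates")
    cur = conn.execute(f"DELETE FROM cities WHERE id NOT IN ({KEEP_IDS})")
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE promoters (id INTEGER PRIMARY KEY, city_id INTEGER);
    CREATE TABLE venues (id INTEGER PRIMARY KEY, city_id INTEGER);
    -- Hypothetical data: references already moved to the lowest 'Paris' id.
    INSERT INTO cities VALUES (1, 'Paris'), (2, 'Berlin'), (3, 'Paris');
    INSERT INTO promoters VALUES (10, 1), (11, 1);
    INSERT INTO venues VALUES (20, 1), (21, 2);
""")

deleted = delete_duplicate_cities(conn)
remaining = conn.execute("SELECT id, name FROM cities ORDER BY id").fetchall()
print(deleted, remaining)
```

The guard before the DELETE is a design choice rather than a requirement: it simply stops the cleanup if one of the UPDATE statements was skipped.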
I hope I understood the question correctly.