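I want to clean up a messy database and replace references to duplicated entries. In this made-up example (mine is much more complex), I have two tables: octopuses and colors.
We know that colors contains duplicate entries whose ids (the values referenced by octopuses.color_id) differ. The way I solve this involves TEMPORARY tables. To avoid the error: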
Can't Reopen Table 'duplicates'
I simply copy my TEMPORARY table several times:
CREATE TEMPORARY TABLE duplicates1 SELECT * FROM duplicates;
CREATE TEMPORARY TABLE duplicates2 SELECT * FROM duplicates;
I would like to avoid cloning TEMPORARY tables. The two tables are defined as follows:
CREATE TABLE `test`.`octopuses` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(45) NOT NULL,
`color_id` INT NOT NULL,
PRIMARY KEY (`id`));
CREATE TABLE `test`.`colors` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(45) NOT NULL,
PRIMARY KEY (`id`));
Some colors are duplicated:
INSERT INTO colors (name) VALUES
('cream'), ('sepia'), ('daffodil'), ('lipstick'),
('lipstick'), ('garnet'), ('flamingo'), ('navy'),
('chartreuse'), ('garnet'), ('flamingo'), ('juniper'),
('flint'), ('flint'), ('charcoal'), ('garnet');
And here are some octopuses:
INSERT INTO octopuses (name, color_id) VALUES
('Bubbles', 1), ('Inky', 8), ('Octavius', 1),
('Sir Inks-A-Lot', 7), ('Octavia', 16), ('Kraken', 6),
('Oncho', 15), ('Big Floppy Sea Spider', 14), ('Calamari', 2),
('Scuba Doo', 13), ('Squidward Tentacles', 5), ('Wiggleton', 9),
('Cthulhu', 2), ('Octopussy', 3), ('Triton', 10),
('Doctor Octopus', 11), ('Billy The Squid', 4), ('Stretch', 12);
To solve this, I first create a list of duplicates:
CREATE TEMPORARY TABLE duplicates SELECT
    *, COUNT(*) AS count
FROM
    colors
GROUP BY name
HAVING count > 1;
Here it is:
mysql> SELECT * FROM duplicates;
+----+----------+-------+
| id | name     | count |
+----+----------+-------+
|  4 | lipstick |     2 |
|  6 | garnet   |     3 |
|  7 | flamingo |     2 |
| 13 | flint    |     2 |
+----+----------+-------+
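One caveat about the query above: `SELECT *` combined with `GROUP BY name` only runs when the `ONLY_FULL_GROUP_BY` SQL mode is disabled, and in that case the server is free to return any `id` from each group. A minimal sketch that names the kept id explicitly, assuming the lowest id of each name is the one to keep (the alias `first_id` is just an illustrative name):

```sql
-- One row per duplicated color name, with the lowest id chosen explicitly.
-- Assumption: the smallest id in each group is the row to keep.
SELECT
    name,
    MIN(id)  AS first_id,  -- deterministic replacement for the bare `id` column
    COUNT(*) AS count
FROM
    colors
GROUP BY name
HAVING COUNT(*) > 1;
```

On the sample data this selects the same ids as the table shown above (4, 6, 7 and 13), but it no longer depends on the SQL mode or on which row the server happens to pick.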
Then I want to create a correspondence table that pairs each duplicate id with the id that should replace it:
CREATE TEMPORARY TABLE duplicates1 SELECT * FROM duplicates;
CREATE TEMPORARY TABLE duplicates2 SELECT * FROM duplicates;
CREATE TEMPORARY TABLE corresponding SELECT
    id, name,
    (SELECT id
     FROM duplicates2
     WHERE duplicates2.name = colors.name) AS first_id
FROM
    colors
WHERE
    name IN (SELECT name FROM duplicates)
    AND id NOT IN (SELECT id FROM duplicates1)
ORDER BY name ASC;
Its contents:
mysql> SELECT * FROM corresponding;
+----+----------+----------+
| id | name     | first_id |
+----+----------+----------+
| 11 | flamingo |        7 |
| 14 | flint    |       13 |
| 10 | garnet   |        6 |
| 16 | garnet   |        6 |
|  5 | lipstick |        4 |
+----+----------+----------+
Then I simply update the octopuses table:
CREATE TEMPORARY TABLE corresponding1 SELECT * FROM corresponding;
UPDATE octopuses
SET
    color_id = (SELECT first_id
                FROM corresponding1
                WHERE corresponding1.id = color_id)
WHERE
    color_id IN (SELECT id FROM corresponding);
Finally, I delete the duplicates:
DELETE FROM colors WHERE id IN (SELECT id FROM corresponding);
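This example may not be the best illustration of my problem, but what I want here is to avoid cloning temporary tables and to find a way to run a select with multiple IN conditions against a TEMPORARY table.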
Answer 0 (score: 0)
Try thinking about it the other way around.
You can do it like this:
UPDATE octopuses
        INNER JOIN
    (SELECT
        *,
        -- lowest id among the colors that share this octopus's color name
        (SELECT id
         FROM colors
         WHERE colors.name = (SELECT name
                              FROM colors
                              WHERE color_id = colors.id)
         ORDER BY colors.id
         LIMIT 1) AS first_color_id
    FROM
        octopuses
    -- keep only octopuses that point at a duplicate rather than the first id
    HAVING color_id <> first_color_id) AS dup ON dup.color_id = octopuses.color_id
SET
    octopuses.color_id = dup.first_color_id
WHERE
    octopuses.color_id <> dup.first_color_id;
CREATE TEMPORARY TABLE to_delete SELECT id FROM colors WHERE NOT EXISTS (
SELECT id FROM octopuses WHERE color_id = colors.id
);
DELETE FROM colors WHERE id IN (SELECT id FROM to_delete);
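Note that the `to_delete` step removes every color that no octopus references, which would also include a unique color that simply happens to be unused. If only the duplicate rows should go, one possible sketch is a self-join delete, assuming again that the lowest id per name is the one to keep (the aliases `d` and `keep` are illustrative):

```sql
-- Delete every colors row for which another row with the same name
-- and a lower id exists. Run this only after octopuses.color_id has
-- been repointed, otherwise some octopuses would reference missing colors.
DELETE d
FROM colors AS d
    INNER JOIN colors AS keep
        ON keep.name = d.name
       AND keep.id < d.id;
```

This touches only the base table colors, so no temporary table is opened at all.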
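So the answer to your question is:
Whenever you find yourself needing to clone a temporary table, think twice: you will likely find another approach that does not involve reopening the temporary table.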
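As one illustration of that advice, the update can also be sketched with plain joins against a derived table instead of correlated subqueries. This is a hedged variant, not the only way to do it; it assumes the lowest id per color name is the one to keep, and the aliases `o`, `c`, `f` and `first_id` are illustrative names:

```sql
-- Repoint each octopus at the lowest id that shares its color's name.
-- The derived table f is computed from colors within this statement,
-- so no TEMPORARY table is created or referenced.
UPDATE octopuses AS o
    INNER JOIN colors AS c
        ON c.id = o.color_id
    INNER JOIN (SELECT name, MIN(id) AS first_id
                FROM colors
                GROUP BY name) AS f
        ON f.name = c.name
SET
    o.color_id = f.first_id
WHERE
    o.color_id <> f.first_id;
```

Because a derived table is evaluated inside the statement that uses it, the same subquery can appear in as many statements as needed without running into the "Can't reopen table" restriction that applies to TEMPORARY tables.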