I have a one-to-many relationship that I am converting to a many-to-many relationship.
Example:
Main Table (
Id int,
Code varchar(2)
)
Secondary Table (
Id int,
Name varchar(250),
MainId int
)
The Main table contains the following entries:
Id Code
1 A
2 B
3 C
Secondary table:
Id Name MainId
1 Foo 1
2 Bar 1
3 Foo 2
4 Bar 2
5 Bar 3
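Because values in the 'Name' column of the Secondary table are frequently repeated, the database has grown considerably. I decided to convert to a many-to-many relationship and reference only unique 'Name' entries.
As a first step, I created the following junction table: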
MainSecondary Table (
MainId int,
SecondaryId int,
)
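For the final step, I need to update the existing references and delete the duplicate records based on the 'Name' column. This is where I am stuck (the table has more than a million records).
The expected result should be:
Main table: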
Id Code
1 A
2 B
3 C
Secondary table:
Id Name
1 Foo
2 Bar
MainSecondary table:
MainId SecondaryId
1 (A) 1 (Foo)
1 (A) 2 (Bar)
2 (B) 1 (Foo)
2 (B) 2 (Bar)
3 (C) 1 (Foo)
Answer 0 (score: 1)
Setup
create table main
(
id int,
code varchar(2)
);
create table secondary
(
id int,
name varchar(250),
main_id int
);
insert into main (id, code) values (1, 'A');
insert into main (id, code) values (2, 'B');
insert into main (id, code) values (3, 'C');
insert into secondary (id, name, main_id) values (1, 'Foo', 1);
insert into secondary (id, name, main_id) values (2, 'Bar', 1);
insert into secondary (id, name, main_id) values (3, 'Foo', 2);
insert into secondary (id, name, main_id) values (4, 'Bar', 2);
insert into secondary (id, name, main_id) values (5, 'Bar', 3);
Create the new_secondary table
create table new_secondary
(
id int,
name varchar(250)
);
Create the new relationship table: main_secondary
create table main_secondary
(
main_id int,
secondary_id int
);
Populate the new_secondary table, removing duplicates
insert into new_secondary
(
id,
name
)
select
min(id),
name
from
secondary
group by
name;
Populate the main_secondary relationship table
insert into main_secondary
(
main_id,
secondary_id
)
select distinct
a.main_id,
b.id as secondary_id
from
secondary a
join
new_secondary b
on a.name = b.name;
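With more than a million rows, it may help to index the name column before the join and to run both population steps in a single transaction, so a failure leaves the new tables empty rather than half filled. Below is a minimal sketch using Python's sqlite3 module and an in-memory database. The SQLite engine, the index name idx_secondary_name, and the helper migrate are assumptions for illustration and are not part of the original setup.

```python
import sqlite3

def migrate(conn):
    """Fill new_secondary and main_secondary from secondary."""
    # Hypothetical index to speed up the GROUP BY and the join on name
    conn.execute("create index idx_secondary_name on secondary (name)")
    with conn:  # both inserts commit together, or roll back together on error
        conn.execute(
            "insert into new_secondary (id, name) "
            "select min(id), name from secondary group by name"
        )
        conn.execute(
            "insert into main_secondary (main_id, secondary_id) "
            "select distinct a.main_id, b.id "
            "from secondary a join new_secondary b on a.name = b.name"
        )

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table secondary (id int, name varchar(250), main_id int);
create table new_secondary (id int, name varchar(250));
create table main_secondary (main_id int, secondary_id int);
insert into secondary values
  (1,'Foo',1),(2,'Bar',1),(3,'Foo',2),(4,'Bar',2),(5,'Bar',3);
""")
migrate(conn)
pairs = conn.execute(
    "select main_id, secondary_id from main_secondary "
    "order by main_id, secondary_id"
).fetchall()
print(pairs)
```

Whether the index pays off depends on the engine and the data, so it is worth measuring on a copy of the real table first.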
Check the results
select
a.id as main_id,
a.code,
c.id as secondary_id,
c.name
from
main a
join
main_secondary b
on a.id = b.main_id
join
new_secondary c
on c.id = b.secondary_id;
Results
main_id     code secondary_id name
----------- ---- ------------ -------
1           A    1            Foo
2           B    1            Foo
1           A    2            Bar
2           B    2            Bar
3           C    2            Bar
(5 rows affected)
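The row 3 (C) / 2 (Bar) differs from your expected result, but I believe it is correct: in your source data, the Secondary row with MainId 3 is named Bar, not Foo.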
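One way to confirm this is to rebuild the original (main_id, name) pairs from the new tables and compare them with the old secondary table. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; the helper name round_trip_ok is hypothetical.

```python
import sqlite3

def round_trip_ok(conn):
    """True if the new tables reproduce every (main_id, name) pair of secondary."""
    old = set(conn.execute("select main_id, name from secondary"))
    new = set(conn.execute(
        "select ms.main_id, ns.name "
        "from main_secondary ms "
        "join new_secondary ns on ns.id = ms.secondary_id"
    ))
    return old == new

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table secondary (id int, name varchar(250), main_id int);
create table new_secondary (id int, name varchar(250));
create table main_secondary (main_id int, secondary_id int);
insert into secondary values
  (1,'Foo',1),(2,'Bar',1),(3,'Foo',2),(4,'Bar',2),(5,'Bar',3);
insert into new_secondary
  select min(id), name from secondary group by name;
insert into main_secondary
  select distinct a.main_id, b.id
  from secondary a join new_secondary b on a.name = b.name;
""")
ok = round_trip_ok(conn)
print(ok)
```

This version pulls both sets of pairs into client memory, which is fine for the sample data. On a table with millions of rows, doing the comparison inside the database (for example with EXCEPT, where the engine supports it) would likely be a better fit.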
Once you are sure everything is correct, drop the old secondary table and rename new_secondary to secondary to keep things tidy.
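The cleanup could look roughly like the sketch below, which drops the old table, renames new_secondary, and adds a unique index on the junction table so the same pair cannot be inserted twice. It assumes SQLite syntax (alter table ... rename to) run through Python's sqlite3 module. Other engines rename tables differently (SQL Server, for example, uses sp_rename), and the index name ux_main_secondary is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table secondary (id int, name varchar(250), main_id int);
create table new_secondary (id int, name varchar(250));
create table main_secondary (main_id int, secondary_id int);
insert into secondary values
  (1,'Foo',1),(2,'Bar',1),(3,'Foo',2),(4,'Bar',2),(5,'Bar',3);
insert into new_secondary
  select min(id), name from secondary group by name;
insert into main_secondary
  select distinct a.main_id, b.id
  from secondary a join new_secondary b on a.name = b.name;

-- Cleanup: run only after the migrated data has been checked
drop table secondary;
alter table new_secondary rename to secondary;
-- Hypothetical unique index: blocks duplicate (main_id, secondary_id) pairs
create unique index ux_main_secondary
  on main_secondary (main_id, secondary_id);
""")
tables = [row[0] for row in conn.execute(
    "select name from sqlite_master where type = 'table' order by name")]
rows = conn.execute("select id, name from secondary order by id").fetchall()
print(tables)
print(rows)
```

Taking a backup before the drop is a sensible precaution, since the old main_id column cannot be recovered from the renamed table alone.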