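I have a table named Student that stores all the basic information about students and the trainings they attended (the table has more than 15 columns and more than 5000 records). A sample of the table looks like this: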
St_id  St_Name  St_University  SoftSkillTraining  StartDate   EndDate     ComputerTraining  StartDate   EndDate
---------------------------------------------------------------------------------------------------------------
1      x        x              True               12/02/2017  12/03/2017  False             -           -
2      y        x              True               25/05/2016  25/06/2016  True              01/08/2017
However, the table is not normalized, and I need to split the Student table into three separate tables (in a many-to-many relationship):
The Student table contains the students' basic information, such as:
St_id  St_Name  St_University    St_Faculty
--------------------------------------------------
1      X        Some University  Law
2      y        Some University  IT
The Training table stores the training name, start date, and end date columns. The Training table should look like this:
TrainingId  TrainingName  StartDate   EndDate     TrainingLocation
-----------------------------------------------------------------
1           SoftSkill     12/02/2017  12/03/2017  Some Location
2           SoftSkill     25/02/2016  25/06/2016  Some Location
3           CMOA          01/08/2017  01/09/2017  some location
The intersection table stores only the primary keys of the Student and Training tables as foreign keys, like this:
st_id  training_id
-----------------------
1      1
2      2
2      1
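How can I transfer the data from the student table to the Training table using a stored procedure? As you can see, data that sits in different columns of the student table should end up as rows in the training table.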
Answer 0 (score: 0)
One way to accomplish the task:
create table Students (
St_id int primary key,
St_Name varchar(5),
St_University varchar(5),
SoftSkillTraining varchar(5),
ST_StartDate varchar(10),
ST_EndDate varchar(10),
ComputerTraining varchar(5),
CT_StartDate varchar(10),
CT_EndDate varchar(10)
);
-- St_id is an int column, so the ids are passed as numbers rather than strings
insert into Students (St_id, St_Name, St_University, SoftSkillTraining, ST_StartDate, ST_EndDate, ComputerTraining, CT_StartDate, CT_EndDate)
values (1, 'x', 'x', 'True', '12/02/2017', '12/03/2017', 'False', NULL, NULL);
insert into Students (St_id, St_Name, St_University, SoftSkillTraining, ST_StartDate, ST_EndDate, ComputerTraining, CT_StartDate, CT_EndDate)
values (2, 'y', 'x', 'True', '25/05/2016', '25/06/2016', 'True', '01/08/2017', NULL);
create table Student (
St_id int primary key,
St_Name varchar(5),
St_University varchar(5)
);
insert into Student (St_id, St_Name,St_University)
select distinct St_id , St_Name , St_University from Students;
create table Training (
Training_Id int identity(1,1) primary key,
-- reference the normalized Student table rather than the wide Students source table
Student_Id int foreign key references Student(St_id),
Training_Name varchar(20),
StartDate varchar(10),
EndDate varchar(10)
);
insert into Training (Student_Id ,Training_Name , StartDate, EndDate)
values ('1' , 'SoftSkillTraining' , '12/02/2017' , '12/03/2017' );
insert into Training (Student_Id ,Training_Name , StartDate, EndDate)
values ('2' , 'SoftSkillTraining' , '25/05/2016' , '25/06/2016' );
insert into Training (Student_Id ,Training_Name , StartDate, EndDate)
values ('2' , 'ComputerTraining' , '01/08/2017' , NULL );
create table Intersection (
Intersection_Id int identity(1,1) primary key,
-- reference the normalized Student table rather than the wide Students source table
Student_id int foreign key references Student(St_id),
Training_Id int foreign key references Training(Training_Id)
);
insert into Intersection (Student_id,Training_Id)
select St_id, Training_Id from Student join Training on St_id = Student_Id
go
create view Participants
as
-- columns are qualified with table aliases so that Student_id is never ambiguous
select s.St_Name as Participant, t.Training_Name
from Intersection i
join Student s on i.Student_id = s.St_id
join Training t on i.Training_Id = t.Training_Id
go
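With more than 5000 records, typing one INSERT per training is not practical. As a replacement for the three hard-coded inserts into Training above, the rows could be derived from the wide table directly. This is only a sketch under the assumptions of this answer: the source table is Students with the ST_/CT_ column names used here, and the flag columns hold the literal text 'True'. Each further training column group would need its own SELECT branch.

```sql
-- One SELECT branch per training column group in the wide Students table.
insert into Training (Student_Id, Training_Name, StartDate, EndDate)
select St_id, 'SoftSkillTraining', ST_StartDate, ST_EndDate
from Students
where SoftSkillTraining = 'True'
union all
select St_id, 'ComputerTraining', CT_StartDate, CT_EndDate
from Students
where ComputerTraining = 'True';
```

Run it instead of the hard-coded inserts, not in addition to them, otherwise every training row is inserted twice.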
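Answer 1 (score: 0)
You have quite a few tasks ahead of you, but normalizing this table is the right thing to do. In the sample of your old table I notice that [StartDate] & [EndDate] are repeated. That is not possible in SQL Server: all column names in a table must be unique. I hope this is just a slip in the sample, because it matters a great deal.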
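Below I use a technique that "unpivots" each student row into multiple shorter rows, which is an interim step toward the goal. The technique uses CROSS APPLY and VALUES. Note that you will need to prepare the VALUES section by hand, although you could get the list of columns from a query against the information schema (that query is not provided here).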
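For what it's worth, the column list can be read from the standard INFORMATION_SCHEMA.COLUMNS view. A minimal sketch, assuming the wide table is named Student as in the setup below; if the same table name exists in more than one schema you would also filter on TABLE_SCHEMA:

```sql
-- List the columns of the wide table in their defined order,
-- as raw material for writing the VALUES rows by hand.
SELECT COLUMN_NAME, DATA_TYPE, ORDINAL_POSITION
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Student'
ORDER BY ORDINAL_POSITION;
```

The output still has to be grouped into flag/start/end triples manually, since nothing in the metadata says which columns belong together.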
MS SQL Server 2014 Schema Setup:
CREATE TABLE Student
([St_id] int, [St_Name] varchar(1), [St_University] varchar(1)
, [SoftSkillTraining] varchar(5), [StartDate1] datetime, [EndDate1] datetime
, [ComputerTraining] varchar(5), [StartDate2] datetime, [EndDate2] datetime)
;
INSERT INTO Student
([St_id], [St_Name], [St_University]
, [SoftSkillTraining], [StartDate1], [EndDate1]
, [ComputerTraining], [StartDate2], [EndDate2])
VALUES
(1, 'x', 'x', 'True', '2017-02-12 00:00:00', '2017-03-12 00:00:00', 'False', NULL, NULL),
(2, 'y', 'x', 'True', '2016-05-25 00:00:00', '2016-06-25 00:00:00', 'True', '2017-08-01', NULL)
;
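This is the most important query; it "unpivots" the source data into multiple rows.
Note how an id is assigned to each training course, and that the column groups (e.g. [SoftSkillTraining], [StartDate1], [EndDate1]) must be specified row by row in the values area. Each row there produces one new row of output, so the "layout" of the values area essentially determines the final output. This is the area where you need to gather all the column names carefully and arrange them precisely.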
select
St_id, ca.TrainingId, ca.TrainingName, ca.isEnrolled, ca.StartDate, ca.EndDate
into training_setup
from Student
cross apply (
values
(1, 'SoftSkillTraining', [SoftSkillTraining], [StartDate1], [EndDate1])
,(2, 'ComputerTraining', [ComputerTraining], [StartDate2], [EndDate2])
) ca (TrainingId,TrainingName,isEnrolled, StartDate,EndDate)
where ca.isEnrolled = 'True'
;
Query 2:
select
*
from training_setup
**Results**:
| St_id | TrainingId | TrainingName | isEnrolled | StartDate | EndDate |
|-------|------------|-------------------|------------|----------------------|----------------------|
| 1 | 1 | SoftSkillTraining | True | 2017-02-12T00:00:00Z | 2017-03-12T00:00:00Z |
| 2 | 1 | SoftSkillTraining | True | 2016-05-25T00:00:00Z | 2016-06-25T00:00:00Z |
| 2 | 2 | ComputerTraining | True | 2017-08-01T00:00:00Z | (null) |
Query 3:
-- this can be the basis for table [Training]
select distinct TrainingId,TrainingName, StartDate,EndDate
from training_setup
**Results**:
| TrainingId | TrainingName | StartDate | EndDate |
|------------|-------------------|----------------------|----------------------|
| 1 | SoftSkillTraining | 2016-05-25T00:00:00Z | 2016-06-25T00:00:00Z |
| 1 | SoftSkillTraining | 2017-02-12T00:00:00Z | 2017-03-12T00:00:00Z |
| 2 | ComputerTraining | 2017-08-01T00:00:00Z | (null) |
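Note: I have reservations about the consistency of this data; notice that a single course appears with different start/end dates. I don't have a simple solution for that. You may need to clean the data to minimize the differences, and/or you may need an extra step that combines the id we used in the cross apply with the start/end date pair to arrive at a better version of training_id, by updating the training_setup staging table before proceeding.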
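One possible shape for that extra step, offered as a sketch rather than a finished solution: number each distinct combination of training id and start/end date pair with DENSE_RANK and store the result in the staging table. The column name NewTrainingId is my own invention for illustration, and the approach assumes that every distinct date pair really is a separate session of the course, which you would need to confirm against the real data.

```sql
-- Add a hypothetical column to hold the refined training id.
ALTER TABLE training_setup ADD NewTrainingId int NULL;
GO

-- Each distinct (TrainingId, StartDate, EndDate) combination gets its own number.
-- Updating through the CTE writes the number back to training_setup.
WITH numbered AS (
    SELECT NewTrainingId,
           DENSE_RANK() OVER (ORDER BY TrainingId, StartDate, EndDate) AS rn
    FROM training_setup
)
UPDATE numbered
SET NewTrainingId = rn;

-- With the refined id, each id maps to exactly one name and date pair.
SELECT DISTINCT NewTrainingId, TrainingName, StartDate, EndDate
FROM training_setup;
```

The GO separator is there because the new column has to exist before the batch that references it is compiled. Dirty data (for example the same session typed with slightly different dates) would still produce spurious sessions, so this does not replace the cleanup mentioned above.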
Query 4:
-- this can be the basis for table [Student_Training]
select St_id, TrainingId
from training_setup
**Results**:
| St_id | TrainingId |
|-------|------------|
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |