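I have three tables:
Sometimes I know that a student has transferred to a different school, in which case I simply change Student.SchoolID to the new SchoolID.
Often I don't find out that a student has transferred, and I end up with two students who have the same first name, last name, and birth date but different SchoolIDs. I call these duplicate students (an index prevents all four fields from being identical). When this happens, I have to go to the StudentTest table and change StudentTest.StudentID to match the new StudentID. Once all the tests have been moved to the "new" student, I can delete the old student record.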
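For a single pair of records, the manual fix described above might look like the following sketch. The table variables @Student and @StudentTest are hypothetical stand-ins that carry only the columns the question names, and the two IDs are taken from the sample data further down.

```sql
-- Stand-in tables (hypothetical shapes; only the columns named in the question).
DECLARE @Student TABLE (StudentID INT PRIMARY KEY, SchoolID INT);
DECLARE @StudentTest TABLE (StudentTestID INT PRIMARY KEY, StudentID INT);

INSERT INTO @Student (StudentID, SchoolID)
VALUES (16172, 461), (5234, 146);

INSERT INTO @StudentTest (StudentTestID, StudentID)
VALUES (142773, 16172), (136350, 16172), (128517, 5234), (123560, 5234);

DECLARE @OldID INT = 5234, @NewID INT = 16172;

-- 1. Re-point the old record's tests at the "new" student.
UPDATE @StudentTest SET StudentID = @NewID WHERE StudentID = @OldID;

-- 2. Remove the old student record now that no test references it.
DELETE FROM @Student WHERE StudentID = @OldID;

SELECT * FROM @StudentTest ORDER BY StudentTestID;
SELECT * FROM @Student;
```

Doing this by hand for one pair is simple; the difficulty is repeating it for every duplicate group in one statement.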
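While cleaning up a SQL Server database of 35,000 students, I found 1,600 duplicated students, each with a varying number of tests attached.
I can select the tests of the duplicated students so that they are grouped together, using a view (vwTestList, which combines some student information with the test information):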
SELECT
    a.SchoolID as SchID, a.StudentID as StudID, a.StudentTestID as TestID,
    a.LastName as LName, a.FirstName as FName,
    a.Birthdate as BDate, a.Testdate as Tdate, a.Grade
FROM
    dbo.vwTestList a
JOIN
    (SELECT
         firstname, lastname, BirthDate, SchoolID
     FROM
         dbo.vwTestList
     GROUP BY
         firstname, lastname, BirthDate, SchoolID
     HAVING
         COUNT(*) > 1) b ON a.firstname = b.firstname
                        AND a.lastname = b.lastname
                        AND a.BirthDate = b.BirthDate
                        AND a.SchoolID <> b.SchoolID
ORDER BY
    a.LastName, a.FirstName, a.BirthDate, a.Testdate DESC
Sample results:
SchID  StudID  TestID  LName  FName  BDate       Tdate       Grade
------------------------------------------------------------------
461    16172   142773  Auk    Jay    2000-06-29  2010-04-13  4.7
461    16172   136350  Auk    Jay    2000-06-29  2009-04-14  3.7
146    5234    128517  Auk    Jay    2000-06-29  2008-04-01  2.7
146    5234    123560  Auk    Jay    2000-06-29  2007-04-10  1.7
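However, I can't figure out an update query that changes all the tests in each group of duplicate students to the StudentID of the most recent test in that group. In this example, all of Jay Auk's tests should end up with StudentID 16172. Any help would be greatly appreciated!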
Answer 0 (score: 0)
Take a look at the following example and see if it helps. It can be run as-is in SSMS.
-- replicate environment --
DECLARE @TestView TABLE (
SchID INT, StudID INT, TestID INT, LName VARCHAR(30), FName VARCHAR(30), BDate DATETIME, TDate DATETIME, Grade DECIMAL(10,1)
)
INSERT INTO @TestView (
SchID, StudID, TestID, LName, FName, BDate, TDate, Grade
)
VALUES
( 461, 16172, 142773, 'Auk', 'Jay', '2000-06-29', '2010-04-13', 4.7 )
, ( 461, 16172, 136350, 'Auk', 'Jay', '2000-06-29', '2009-04-14', 3.7 )
, ( 146, 5234, 128517, 'Auk', 'Jay', '2000-06-29', '2008-04-01', 2.7 )
, ( 146, 5234, 123560, 'Auk', 'Jay', '2000-06-29', '2007-04-10', 1.7 )
, ( 152, 17899, 123561, 'Gates', 'Bill', '1955-10-28', '2007-04-15', 4.7 )
, ( 152, 17899, 123562, 'Gates', 'Bill', '1955-10-28', '2007-04-14', 3.7 )
, ( 157, 5235, 123563, 'Gates', 'Bill', '1955-10-28', '2007-04-01', 2.7 )
, ( 157, 5235, 123564, 'Gates', 'Bill', '1955-10-28', '2007-04-10', 1.7 );
-- starting data --
SELECT * FROM @TestView;
Returns:
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
| SchID | StudID | TestID | LName | FName | BDate | TDate | Grade |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
| 461 | 16172 | 142773 | Auk | Jay | 2000-06-29 00:00:00.000 | 2010-04-13 00:00:00.000 | 4.7 |
| 461 | 16172 | 136350 | Auk | Jay | 2000-06-29 00:00:00.000 | 2009-04-14 00:00:00.000 | 3.7 |
| 146 | 5234 | 128517 | Auk | Jay | 2000-06-29 00:00:00.000 | 2008-04-01 00:00:00.000 | 2.7 |
| 146 | 5234 | 123560 | Auk | Jay | 2000-06-29 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7 |
| 152 | 17899 | 123561 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-15 00:00:00.000 | 4.7 |
| 152 | 17899 | 123562 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-14 00:00:00.000 | 3.7 |
| 157 | 5235 | 123563 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-01 00:00:00.000 | 2.7 |
| 157 | 5235 | 123564 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7 |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
You can see the starting values above.
Moving on...
-- update the StudID to the most recent test StudID for student --
-- BDate is part of the match so that two different students who
-- share a name are not merged into one StudID.
UPDATE vwTestList
SET
    StudID = Tests.MostRecentID
FROM @TestView vwTestList
CROSS APPLY (
    SELECT TOP 1 StudID AS MostRecentID FROM @TestView vw
    WHERE
        vw.LName = vwTestList.LName
        AND vw.FName = vwTestList.FName
        AND vw.BDate = vwTestList.BDate
    ORDER BY vw.TDate DESC
) AS Tests;
-- view results --
SELECT * FROM @TestView;
Returns:
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
| SchID | StudID | TestID | LName | FName | BDate | TDate | Grade |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
| 461 | 16172 | 142773 | Auk | Jay | 2000-06-29 00:00:00.000 | 2010-04-13 00:00:00.000 | 4.7 |
| 461 | 16172 | 136350 | Auk | Jay | 2000-06-29 00:00:00.000 | 2009-04-14 00:00:00.000 | 3.7 |
| 146 | 16172 | 128517 | Auk | Jay | 2000-06-29 00:00:00.000 | 2008-04-01 00:00:00.000 | 2.7 |
| 146 | 16172 | 123560 | Auk | Jay | 2000-06-29 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7 |
| 152 | 17899 | 123561 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-15 00:00:00.000 | 4.7 |
| 152 | 17899 | 123562 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-14 00:00:00.000 | 3.7 |
| 157 | 17899 | 123563 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-01 00:00:00.000 | 2.7 |
| 157 | 17899 | 123564 | Gates | Bill | 1955-10-28 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7 |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
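The StudID on every row has been updated to the StudID of that student's most recent test. Joining on the name makes me nervous, but if you have no other unique way to guarantee that two records are the same student (two different students can share a name, for example), there may not be many other options.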
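One way to gauge how risky name-based matching is for a given data set is to look for names that map to more than one birth date, since those are probably different people. The sketch below uses a stand-in table variable shaped like the sample data; the namesake row (StudID 20001, born 2001-01-15) is invented purely for illustration and does not come from the question.

```sql
-- Stand-in for the view; the namesake row is invented for illustration only.
DECLARE @TestView TABLE (
    SchID INT, StudID INT, TestID INT, LName VARCHAR(30), FName VARCHAR(30),
    BDate DATE, TDate DATE, Grade DECIMAL(10,1)
);

INSERT INTO @TestView (SchID, StudID, TestID, LName, FName, BDate, TDate, Grade)
VALUES
  ( 461, 16172, 142773, 'Auk', 'Jay', '2000-06-29', '2010-04-13', 4.7 )
, ( 146,  5234, 128517, 'Auk', 'Jay', '2000-06-29', '2008-04-01', 2.7 )
, ( 300, 20001, 150001, 'Auk', 'Jay', '2001-01-15', '2010-04-13', 3.2 )  -- hypothetical namesake
, ( 152, 17899, 123561, 'Gates', 'Bill', '1955-10-28', '2007-04-15', 4.7 );

-- Names that cover more than one birth date: matching on name alone
-- would fold these students into a single StudID.
SELECT LName, FName,
       COUNT(DISTINCT BDate)  AS BirthDates,
       COUNT(DISTINCT StudID) AS StudIDs
FROM @TestView
GROUP BY LName, FName
HAVING COUNT(DISTINCT BDate) > 1;
```

With these rows only Jay Auk is returned, which is the kind of group worth inspecting by hand before any update that matches on name.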
As always, verify your data before actually running the update, but this should get you started.
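The example above updates a table variable shaped like the view, so the change still has to be applied to StudentTest itself. The following is a minimal sketch of one way to do that, not a tested production script. It assumes that Student holds StudentID, SchoolID, FirstName, LastName and BirthDate and that StudentTest holds StudentTestID, StudentID and TestDate; those column lists are guesses based on the question, and @Student, @StudentTest and @Keep are stand-in table variables so the script can run on its own in SSMS. It uses ROW_NUMBER rather than CROSS APPLY to pick, per name-and-birth-date group, the StudentID that owns the latest test.

```sql
-- Stand-ins for dbo.Student and dbo.StudentTest (column lists are assumptions).
DECLARE @Student TABLE (
    StudentID INT PRIMARY KEY, SchoolID INT,
    FirstName VARCHAR(30), LastName VARCHAR(30), BirthDate DATE
);
DECLARE @StudentTest TABLE (
    StudentTestID INT PRIMARY KEY, StudentID INT, TestDate DATE
);

INSERT INTO @Student (StudentID, SchoolID, FirstName, LastName, BirthDate)
VALUES (16172, 461, 'Jay', 'Auk', '2000-06-29')
     , ( 5234, 146, 'Jay', 'Auk', '2000-06-29');

INSERT INTO @StudentTest (StudentTestID, StudentID, TestDate)
VALUES (142773, 16172, '2010-04-13')
     , (136350, 16172, '2009-04-14')
     , (128517,  5234, '2008-04-01')
     , (123560,  5234, '2007-04-10');

-- One row per (name, birth date) group: the StudentID that owns the latest test.
DECLARE @Keep TABLE (
    FirstName VARCHAR(30), LastName VARCHAR(30), BirthDate DATE, KeepID INT
);

INSERT INTO @Keep (FirstName, LastName, BirthDate, KeepID)
SELECT FirstName, LastName, BirthDate, StudentID
FROM (
    SELECT s.FirstName, s.LastName, s.BirthDate, t.StudentID,
           ROW_NUMBER() OVER (
               PARTITION BY s.LastName, s.FirstName, s.BirthDate
               ORDER BY t.TestDate DESC, t.StudentTestID DESC  -- tie-break is an assumption
           ) AS rn
    FROM @StudentTest AS t
    JOIN @Student AS s ON s.StudentID = t.StudentID
) AS ranked
WHERE rn = 1;

-- Preview: tests that would be re-pointed. Review this before updating.
SELECT t.StudentTestID, t.StudentID AS OldID, k.KeepID AS NewID
FROM @StudentTest AS t
JOIN @Student AS s ON s.StudentID = t.StudentID
JOIN @Keep AS k ON k.LastName = s.LastName AND k.FirstName = s.FirstName
               AND k.BirthDate = s.BirthDate
WHERE t.StudentID <> k.KeepID;

-- Re-point the tests at the kept StudentID.
UPDATE t
SET    t.StudentID = k.KeepID
FROM   @StudentTest AS t
JOIN   @Student AS s ON s.StudentID = t.StudentID
JOIN   @Keep AS k ON k.LastName = s.LastName AND k.FirstName = s.FirstName
                 AND k.BirthDate = s.BirthDate
WHERE  t.StudentID <> k.KeepID;

-- Delete the duplicate student rows that no longer own any tests.
DELETE s
FROM   @Student AS s
JOIN   @Keep AS k ON k.LastName = s.LastName AND k.FirstName = s.FirstName
                 AND k.BirthDate = s.BirthDate
WHERE  s.StudentID <> k.KeepID
  AND  NOT EXISTS (SELECT 1 FROM @StudentTest AS t WHERE t.StudentID = s.StudentID);

SELECT * FROM @StudentTest ORDER BY StudentTestID;
SELECT * FROM @Student;
```

With the sample rows, all four tests end up with StudentID 16172 and student 5234 is removed. Against the real tables, the table variables would be swapped for dbo.Student and dbo.StudentTest, and wrapping the UPDATE and DELETE in a transaction makes it possible to roll back if the result does not match the preview.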