Complex update query on SQL Server

Time: 2018-08-28 15:22:26

Tags: sql-server duplicates sql-update

I have three tables:

  • School: SchoolID, SchoolName, City, State, ...
  • Student: StudentID, SchoolID (fk), FirstName, LastName, Birthdate, ...
  • StudentTest: StudentTestID, StudentID (fk), TestDate, Grade, ...

Sometimes I know that a student has transferred to a different school, in which case I simply change Student.SchoolID to the new SchoolID.

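More often, I only discover after the fact that a student has transferred, in which case I end up with two students that have the same FirstName, LastName, and Birthdate but different SchoolIDs. I call these duplicate students (an index prevents all four fields from being identical). When this happens, I have to go to the StudentTest table and change StudentTest.StudentID to match the new StudentID. Once all tests have been moved to the "new" student, I can delete the old student record.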
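The manual fix for a single pair of duplicate students might look like the sketch below. It is only an illustration: the table variables @Student and @StudentTest are hypothetical stand-ins for the real tables, the column types are assumptions, and the two IDs are taken from the Jay Auk sample shown later in this question.

```sql
-- Hypothetical stand-ins for the real Student and StudentTest tables.
DECLARE @Student TABLE (
    StudentID INT PRIMARY KEY, SchoolID INT,
    FirstName VARCHAR(30), LastName VARCHAR(30), Birthdate DATE
);
DECLARE @StudentTest TABLE (
    StudentTestID INT PRIMARY KEY, StudentID INT,
    TestDate DATE, Grade DECIMAL(10,1)
);

INSERT INTO @Student (StudentID, SchoolID, FirstName, LastName, Birthdate)
VALUES
  (16172, 461, 'Jay', 'Auk', '2000-06-29')
, (5234,  146, 'Jay', 'Auk', '2000-06-29');

INSERT INTO @StudentTest (StudentTestID, StudentID, TestDate, Grade)
VALUES
  (142773, 16172, '2010-04-13', 4.7)
, (136350, 16172, '2009-04-14', 3.7)
, (128517, 5234,  '2008-04-01', 2.7)
, (123560, 5234,  '2007-04-10', 1.7);

DECLARE @OldStudentID INT = 5234;   -- the record to retire
DECLARE @NewStudentID INT = 16172;  -- the record to keep

-- Step 1: point the old student's tests at the student being kept.
UPDATE @StudentTest
SET StudentID = @NewStudentID
WHERE StudentID = @OldStudentID;

-- Step 2: the old student no longer owns any tests, so it can be removed.
DELETE FROM @Student
WHERE StudentID = @OldStudentID;

-- Inspect the result.
SELECT StudentTestID, StudentID, TestDate, Grade
FROM @StudentTest
ORDER BY TestDate DESC;
```

Doing this by hand for every pair is what becomes impractical at the scale described next.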

While cleaning up a SQL Server database of 35,000 students, I found 1,600 duplicate students, each with a varying number of tests.
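For reference, duplicate groups of this kind could be listed with a query along the following lines. This is a sketch under assumptions: @Student is a hypothetical stand-in for the Student table, the sample rows are invented for illustration, and it relies on the index mentioned above, which means any two rows sharing name and birthdate must differ in SchoolID.

```sql
-- Hypothetical stand-in for the real Student table.
DECLARE @Student TABLE (
    StudentID INT PRIMARY KEY, SchoolID INT,
    FirstName VARCHAR(30), LastName VARCHAR(30), Birthdate DATE
);

INSERT INTO @Student (StudentID, SchoolID, FirstName, LastName, Birthdate)
VALUES
  (16172, 461, 'Jay',  'Auk',   '2000-06-29')
, (5234,  146, 'Jay',  'Auk',   '2000-06-29')
, (20001, 461, 'Robin', 'Wren', '2001-02-03');  -- not duplicated

-- One row per duplicate group: same name and birthdate on more than one record.
SELECT
    FirstName, LastName, Birthdate,
    COUNT(*) AS StudentRecords
FROM @Student
GROUP BY FirstName, LastName, Birthdate
HAVING COUNT(*) > 1;
```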

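I can select the tests of the duplicate students so that they are grouped together, using a view (vwTestList, which combines some student information with the test information):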

SELECT 
    a.SchoolID as SchID, a.StudentID as StudID, a.StudentTestID as TestID, 
    a.LastName as LName, a.FirstName as FName, 
    a.Birthdate as BDate, a.Testdate as Tdate, a.Grade
FROM 
    dbo.vwTestList a
JOIN 
    (SELECT 
         firstname, lastname, BirthDate, SchoolID
     FROM 
         dbo.vwTestList
     GROUP BY 
         firstname, lastname, BirthDate, SchoolID
     HAVING 
         COUNT(*) > 1) b ON a.firstname = b.firstname
                         AND a.lastname = b.lastname
                         AND a.BirthDate = b.BirthDate
                         AND a.SchoolID <> b.SchoolID
ORDER BY 
    a.LastName, a.FirstName, a.BirthDate, a.Testdate DESC

Sample results:

SchID   StudID  TestID LName    FName   Bdate       TDate      Grade
----------------------------------------------------------------------
461     16172   142773  Auk     Jay     2000-06-29  2010-04-13  4.7 
461     16172   136350  Auk     Jay     2000-06-29  2009-04-14  3.7 
146     5234    128517  Auk     Jay     2000-06-29  2008-04-01  2.7 
146     5234    123560  Auk     Jay     2000-06-29  2007-04-10  1.7

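However, I cannot figure out an update query that changes all tests in each group of duplicate students to the StudentID of the most recent test in that group. In this example, all of Jay Auk's tests should end up with StudentID 16172. Any help would be greatly appreciated!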

1 Answer:

Answer 0 (score: 0)

Take a look at the following example and see whether it helps. It can be run as-is in SSMS.

-- replicate environment --

DECLARE @TestView TABLE (
    SchID INT, StudID INT, TestID INT, LName VARCHAR(30), FName VARCHAR(30), BDate DATETIME, TDate DATETIME, Grade DECIMAL(10,1)
)

INSERT INTO @TestView (
    SchID, StudID, TestID, LName, FName, BDate, TDate, Grade
)
VALUES
  ( 461, 16172, 142773, 'Auk', 'Jay', '2000-06-29', '2010-04-13', 4.7 )
, ( 461, 16172, 136350, 'Auk', 'Jay', '2000-06-29', '2009-04-14', 3.7 )
, ( 146, 5234, 128517, 'Auk', 'Jay', '2000-06-29', '2008-04-01', 2.7 )
, ( 146, 5234, 123560, 'Auk', 'Jay', '2000-06-29', '2007-04-10', 1.7 )
, ( 152, 17899, 123561, 'Gates', 'Bill', '1955-10-28', '2007-04-15', 4.7 )
, ( 152, 17899, 123562, 'Gates', 'Bill', '1955-10-28', '2007-04-14', 3.7 )
, ( 157, 5235, 123563, 'Gates', 'Bill', '1955-10-28', '2007-04-01', 2.7 )
, ( 157, 5235, 123564, 'Gates', 'Bill', '1955-10-28', '2007-04-10', 1.7 );

-- starting data --

SELECT * FROM @TestView;

Returns:

+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
| SchID | StudID | TestID | LName | FName |          BDate          |          TDate          | Grade |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
|   461 |  16172 | 142773 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2010-04-13 00:00:00.000 | 4.7   |
|   461 |  16172 | 136350 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2009-04-14 00:00:00.000 | 3.7   |
|   146 |   5234 | 128517 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2008-04-01 00:00:00.000 | 2.7   |
|   146 |   5234 | 123560 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7   |
|   152 |  17899 | 123561 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-15 00:00:00.000 | 4.7   |
|   152 |  17899 | 123562 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-14 00:00:00.000 | 3.7   |
|   157 |   5235 | 123563 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-01 00:00:00.000 | 2.7   |
|   157 |   5235 | 123564 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7   |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+

You can see the starting values above.

Moving on...

-- update the StudID to the most recent test StudID for student --

UPDATE @TestView
SET
    StudID = Tests.MostRecentID
FROM @TestView vwTestList
CROSS APPLY (

    SELECT TOP 1 StudID AS MostRecentID FROM @TestView vw 
    WHERE 
        vw.LName = vwTestList.LName 
        AND vw.FName = vwTestList.FName 
    ORDER BY vw.TDate DESC

) AS Tests;

-- view results --

SELECT * FROM @TestView;

Returns:

+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
| SchID | StudID | TestID | LName | FName |          BDate          |          TDate          | Grade |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+
|   461 |  16172 | 142773 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2010-04-13 00:00:00.000 | 4.7   |
|   461 |  16172 | 136350 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2009-04-14 00:00:00.000 | 3.7   |
|   146 |  16172 | 128517 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2008-04-01 00:00:00.000 | 2.7   |
|   146 |  16172 | 123560 | Auk   | Jay   | 2000-06-29 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7   |
|   152 |  17899 | 123561 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-15 00:00:00.000 | 4.7   |
|   152 |  17899 | 123562 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-14 00:00:00.000 | 3.7   |
|   157 |  17899 | 123563 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-01 00:00:00.000 | 2.7   |
|   157 |  17899 | 123564 | Gates | Bill  | 1955-10-28 00:00:00.000 | 2007-04-10 00:00:00.000 | 1.7   |
+-------+--------+--------+-------+-------+-------------------------+-------------------------+-------+

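The StudID on every row has been updated to the StudID of that student's most recent test. Joining on the name makes me nervous, since two different students can share a name, but if you have no other unique way to guarantee a student match, there may not be many other options.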
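The same idea could be applied to the underlying tables rather than the view, matching on birthdate as well as name, which is how the question defines a duplicate. The following is a sketch, not a tested solution for the real schema: @Student and @StudentTest are hypothetical stand-ins, the tie-breaker on StudentTestID is an assumption, and FIRST_VALUE requires SQL Server 2012 or later.

```sql
-- Hypothetical stand-ins for the real Student and StudentTest tables.
DECLARE @Student TABLE (
    StudentID INT PRIMARY KEY, SchoolID INT,
    FirstName VARCHAR(30), LastName VARCHAR(30), Birthdate DATE
);
DECLARE @StudentTest TABLE (
    StudentTestID INT PRIMARY KEY, StudentID INT,
    TestDate DATE, Grade DECIMAL(10,1)
);

INSERT INTO @Student (StudentID, SchoolID, FirstName, LastName, Birthdate)
VALUES
  (16172, 461, 'Jay',  'Auk',   '2000-06-29')
, (5234,  146, 'Jay',  'Auk',   '2000-06-29')
, (17899, 152, 'Bill', 'Gates', '1955-10-28')
, (5235,  157, 'Bill', 'Gates', '1955-10-28');

INSERT INTO @StudentTest (StudentTestID, StudentID, TestDate, Grade)
VALUES
  (142773, 16172, '2010-04-13', 4.7)
, (136350, 16172, '2009-04-14', 3.7)
, (128517, 5234,  '2008-04-01', 2.7)
, (123560, 5234,  '2007-04-10', 1.7)
, (123561, 17899, '2007-04-15', 4.7)
, (123562, 17899, '2007-04-14', 3.7)
, (123563, 5235,  '2007-04-01', 2.7)
, (123564, 5235,  '2007-04-10', 1.7);

-- For every test, find the StudentID that owns the most recent test
-- within the same (FirstName, LastName, Birthdate) group.
WITH Ranked AS (
    SELECT
        t.StudentTestID,
        FIRST_VALUE(t.StudentID) OVER (
            PARTITION BY s.FirstName, s.LastName, s.Birthdate
            ORDER BY t.TestDate DESC, t.StudentTestID DESC  -- assumed tie-breaker
        ) AS KeepStudentID
    FROM @StudentTest AS t
    JOIN @Student AS s ON s.StudentID = t.StudentID
)
UPDATE t
SET StudentID = r.KeepStudentID
FROM @StudentTest AS t
JOIN Ranked AS r ON r.StudentTestID = t.StudentTestID
WHERE t.StudentID <> r.KeepStudentID;  -- touch only rows that actually change

-- Remove duplicate students that no longer own any tests.
DELETE s
FROM @Student AS s
WHERE NOT EXISTS (SELECT 1 FROM @StudentTest AS t WHERE t.StudentID = s.StudentID)
  AND EXISTS (
        SELECT 1 FROM @Student AS d
        WHERE d.FirstName = s.FirstName
          AND d.LastName  = s.LastName
          AND d.Birthdate = s.Birthdate
          AND d.StudentID <> s.StudentID
      );

SELECT * FROM @StudentTest ORDER BY StudentID, TestDate DESC;
SELECT * FROM @Student;
```

Including the birthdate narrows the match, but two genuinely different students with the same name and birthdate would still be merged, so the caution about name-based joins continues to apply.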

As always, validate your data before actually running the update, but this should help get you started.
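One way to do that validation is to preview the proposed changes with a SELECT before issuing any UPDATE. The sketch below reuses the shape of the @TestView sample with only the Jay Auk rows. The added birthdate condition and the CurrentStudID and ProposedStudID column names are assumptions made for illustration.

```sql
-- Same shape as the sample table above, with DATE columns for brevity.
DECLARE @TestView TABLE (
    SchID INT, StudID INT, TestID INT,
    LName VARCHAR(30), FName VARCHAR(30),
    BDate DATE, TDate DATE, Grade DECIMAL(10,1)
);

INSERT INTO @TestView (SchID, StudID, TestID, LName, FName, BDate, TDate, Grade)
VALUES
  (461, 16172, 142773, 'Auk', 'Jay', '2000-06-29', '2010-04-13', 4.7)
, (461, 16172, 136350, 'Auk', 'Jay', '2000-06-29', '2009-04-14', 3.7)
, (146, 5234,  128517, 'Auk', 'Jay', '2000-06-29', '2008-04-01', 2.7)
, (146, 5234,  123560, 'Auk', 'Jay', '2000-06-29', '2007-04-10', 1.7);

-- Preview only: list each test whose StudID would change, next to the new value.
SELECT
    v.TestID,
    v.LName,
    v.FName,
    v.StudID           AS CurrentStudID,
    Tests.MostRecentID AS ProposedStudID
FROM @TestView AS v
CROSS APPLY (
    SELECT TOP 1 vw.StudID AS MostRecentID
    FROM @TestView AS vw
    WHERE vw.LName = v.LName
      AND vw.FName = v.FName
      AND vw.BDate = v.BDate  -- assumption: also match on birthdate
    ORDER BY vw.TDate DESC
) AS Tests
WHERE v.StudID <> Tests.MostRecentID
ORDER BY v.LName, v.FName, v.TestID;
```

Reviewing this list, or spot-checking a sample of it, before converting the SELECT into an UPDATE makes it easier to catch groups where two distinct students were matched by mistake.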