How to avoid duplicate rows when inserting a set of rows from a flat file into SQL Server, taking existing column values into account

Time: 2014-06-20 10:28:56

Tags: sql-server

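I have a table that contains sets of rows sharing the same RecordtypeCode. (Image not shown.)

A single row or a set of rows then arrives from a flat file or another source. (Image not shown.)

In the end, the table must hold no duplicate Recordtypecode values, and for each code the other fields should carry the maximum value found across the duplicates. (Image of the desired result not shown.)

What have I tried so far? I fetched all rows from my table, combined them with the new record set, and wrote a stored procedure (using GROUP BY and MAX) that produces the desired output in a temp table. Finally I truncated my table and inserted the temp table data into it.

Is there a better way to do this that avoids performance problems? I will be working with millions of records here.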

2 answers:

Answer 0 (score: 0)

It is hard to answer without more details, but you could try something like this to get the grouped result:

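-- One row per RecordTypeCode, each flag column reduced to its maximum.
SELECT RecordTypeCode,
       MAX(AgeGroupFemale60_64) AS AgeGroupFemale60_64,
       MAX(AgeGroupFemale65_69) AS AgeGroupFemale65_69,
       MAX(AgeGroupFemale70_74) AS AgeGroupFemale70_74
FROM [TempTable]
GROUP BY RecordTypeCode;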
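
The same grouping could also be applied incrementally instead of truncating and reloading the whole table. The following is only a sketch under stated assumptions: `dbo.TargetTable` (the existing table, one row per RecordTypeCode) and `dbo.StagingTable` (the rows loaded from the flat file) are hypothetical names, and the flag columns are assumed to have a type that MAX accepts (for example int or char, not bit). It avoids MERGE, so it should work on SQL Server 2005+.

```sql
-- Hypothetical tables: dbo.TargetTable (existing data, one row per code)
-- and dbo.StagingTable (raw rows loaded from the flat file).

-- Step 1: raise the flags of codes that already exist in the target.
-- The IS NULL test keeps MAX-like behaviour when the stored value is NULL.
UPDATE tgt
SET
  tgt.AgeGroupFemale60_64 = CASE WHEN tgt.AgeGroupFemale60_64 IS NULL
                                   OR src.MaxFemale60_64 > tgt.AgeGroupFemale60_64
                                 THEN src.MaxFemale60_64 ELSE tgt.AgeGroupFemale60_64 END,
  tgt.AgeGroupFemale65_69 = CASE WHEN tgt.AgeGroupFemale65_69 IS NULL
                                   OR src.MaxFemale65_69 > tgt.AgeGroupFemale65_69
                                 THEN src.MaxFemale65_69 ELSE tgt.AgeGroupFemale65_69 END,
  tgt.AgeGroupFemale70_74 = CASE WHEN tgt.AgeGroupFemale70_74 IS NULL
                                   OR src.MaxFemale70_74 > tgt.AgeGroupFemale70_74
                                 THEN src.MaxFemale70_74 ELSE tgt.AgeGroupFemale70_74 END
FROM dbo.TargetTable AS tgt
JOIN (
  -- Collapse the incoming rows to one row per code.
  SELECT
    RecordTypeCode,
    MaxFemale60_64 = MAX(AgeGroupFemale60_64),
    MaxFemale65_69 = MAX(AgeGroupFemale65_69),
    MaxFemale70_74 = MAX(AgeGroupFemale70_74)
  FROM dbo.StagingTable
  GROUP BY RecordTypeCode
) AS src
  ON src.RecordTypeCode = tgt.RecordTypeCode;

-- Step 2: insert codes that are not in the target yet, already de-duplicated.
INSERT INTO dbo.TargetTable
  (RecordTypeCode, AgeGroupFemale60_64, AgeGroupFemale65_69, AgeGroupFemale70_74)
SELECT
  s.RecordTypeCode,
  MAX(s.AgeGroupFemale60_64),
  MAX(s.AgeGroupFemale65_69),
  MAX(s.AgeGroupFemale70_74)
FROM dbo.StagingTable AS s
WHERE NOT EXISTS (
  SELECT 1 FROM dbo.TargetTable AS t WHERE t.RecordTypeCode = s.RecordTypeCode
)
GROUP BY s.RecordTypeCode;
```

With this shape only the codes present in the staging data are touched, so the cost depends on the size of the incoming batch rather than on the full table. An index on RecordTypeCode in the target would likely help both steps.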

Answer 1 (score: 0)

Assuming you are on SQL Server 2005 or later, you can use MAX() OVER to determine the maximum flag values within each Recordtypecode group:

SELECT
  Recordtypecode,
  AgeGroupFemale60_64,
  AgeGroupFemale65_69,
  AgeGroupFemale70_74,
  MAX(AgeGroupFemale60_64) OVER (PARTITION BY Recordtypecode),
  MAX(AgeGroupFemale65_69) OVER (PARTITION BY Recordtypecode),
  MAX(AgeGroupFemale70_74) OVER (PARTITION BY Recordtypecode)
FROM
  dbo.TempTable

and update all the flags with those values:

WITH maximums AS (
  SELECT
    Recordtypecode,
    AgeGroupFemale60_64,
    AgeGroupFemale65_69,
    AgeGroupFemale70_74,
    MaxFemale60_64 = MAX(AgeGroupFemale60_64) OVER (PARTITION BY Recordtypecode),
    MaxFemale65_69 = MAX(AgeGroupFemale65_69) OVER (PARTITION BY Recordtypecode),
    MaxFemale70_74 = MAX(AgeGroupFemale70_74) OVER (PARTITION BY Recordtypecode)
  FROM
    dbo.TempTable
)
UPDATE
  maximums
SET
  AgeGroupFemale60_64 = MaxFemale60_64,
  AgeGroupFemale65_69 = MaxFemale65_69,
  AgeGroupFemale70_74 = MaxFemale70_74
;

Next, you can use ROW_NUMBER() to enumerate all rows within each group:

SELECT
  *,
  rn = ROW_NUMBER() OVER (PARTITION BY Recordtypecode ORDER BY Recordtypecode)
FROM
  dbo.TempTable

and delete all rows where rn > 1:

WITH enumerated AS (
  SELECT
    *,
    -- Rows in a group are identical after the UPDATE, so the order is arbitrary.
    rn = ROW_NUMBER() OVER (PARTITION BY Recordtypecode ORDER BY Recordtypecode)
  FROM
    dbo.TempTable
)
DELETE FROM
  enumerated
WHERE
  rn > 1
;

Alternatively, instead of the two statements, UPDATE and DELETE, you can use a single MERGE (now assuming SQL Server 2008 or later), like this:

WITH enumerated AS (
  SELECT
    *,
    rn = ROW_NUMBER() OVER (PARTITION BY Recordtypecode ORDER BY Recordtypecode)
  FROM
    dbo.TempTable
),
maximums AS (
  SELECT
    Recordtypecode,
    MaxFemale60_64 = MAX(AgeGroupFemale60_64),
    MaxFemale65_69 = MAX(AgeGroupFemale65_69),
    MaxFemale70_74 = MAX(AgeGroupFemale70_74),
    rn = 1
  FROM
    dbo.TempTable
  GROUP BY
    Recordtypecode
)
MERGE INTO
  enumerated AS tgt
USING
  maximums AS src
ON
  tgt.Recordtypecode = src.Recordtypecode AND tgt.rn = src.rn
-- The first row of each group receives the group maximums.
WHEN MATCHED THEN
  UPDATE SET
    tgt.AgeGroupFemale60_64 = src.MaxFemale60_64,
    tgt.AgeGroupFemale65_69 = src.MaxFemale65_69,
    tgt.AgeGroupFemale70_74 = src.MaxFemale70_74
-- Rows with rn > 1 have no match in the source and are removed.
WHEN NOT MATCHED BY SOURCE THEN
  DELETE
;
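
As a quick sanity check after either approach (a sketch, not part of the original answer), a query along these lines should return no rows once the duplicates are gone. `DuplicateCount` is just an illustrative column alias.

```sql
-- Lists any Recordtypecode that still has more than one row in dbo.TempTable.
SELECT
  Recordtypecode,
  DuplicateCount = COUNT(*)
FROM
  dbo.TempTable
GROUP BY
  Recordtypecode
HAVING
  COUNT(*) > 1;
```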

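More information:

  1. OVER Clause (Transact-SQL)

  2. MERGE (Transact-SQL)

    Note that before deciding to use it, you should be aware of the known issues with the MERGE statement. You can start with this article to learn more about them and to see whether they apply to your situation: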