Update duplicate records based on date so that no two dates are the same

Asked: 2018-12-18 17:54:19

Tags: sql sql-server database tsql ssis

Hkey | Observation dt | Retriment_dt | Name  | Code | Masterkey
-----+----------------+--------------+-------+------+----------
23   | 10/8/2018      | 01/01/3030   | Sam   | XYZ  | 99
23   | 10/8/2018      | 01/01/3030   | Sam   | XYZ  | 98
23   | 10/8/2018      | 01/01/3030   | Sam   | XYZ  | 97
21   | 11/8/2018      | 01/01/3030   | JOHN  | TGI  | 65
21   | 11/8/2018      | 01/01/3030   | JOHN  | TGI  | 64
21   | 11/8/2018      | 01/01/3030   | JOHN  | TGI  | 63
30   | 11/8/2018      | 01/01/3030   | Chris | MNY  | 70

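OK, so suppose I have this table, with more than a million rows in total, and I want to update the duplicate rows (Observation dt and Retirement dt). I don't want to set all of them to the same observation date; I want each one to differ by one day. I have entered the desired values manually below. How can I do this in SQL, SSIS, or any programming language? This is an MSSQL DB table. I'm new to SQL and any help would be appreciated. Thanks!

The combination of HKey and Observation_dt is the primary key, and applying the constraint throws an error, so I am trying to retire all the duplicate records by changing both retirement_dt and observation_dt. Retirement dt would be today's date, and observation_dt can be the original date minus 1 (stepping back one more day for every additional duplicate).

What it should look like after the code runs: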

Hkey | Observation dt | Retriment_dt | Name  | Code | Masterkey
-----+----------------+--------------+-------+------+----------
23   | 10/8/2018      | 01/01/3030   | Sam   | XYZ  | 99
23   | 10/7/2018      | 12/17/2018   | Sam   | XYZ  | 98
23   | 10/6/2018      | 12/17/2018   | Sam   | XYZ  | 97
21   | 11/8/2018      | 01/01/3030   | JOHN  | TGI  | 65
21   | 11/7/2018      | 12/17/2018   | JOHN  | TGI  | 64
21   | 11/6/2018      | 12/17/2018   | JOHN  | TGI  | 63
30   | 11/8/2018      | 01/01/3030   | Chris | MNY  | 70

3 Answers:

Answer 0 (score: 0)

You can use the following solution:

IF OBJECT_ID('tempdb..#YourTable') IS NOT NULL
    DROP TABLE #YourTable

SELECT
    V.Hkey,
    [Observation dt] = CONVERT(DATE, V.[Observation dt]),
    [Retriment_dt] = CONVERT(DATE, V.[Retriment_dt])
INTO
    #YourTable
FROM
    (VALUES
    (23,'2018-08-10','3030-01-01'),
    (23,'2018-08-10','3030-01-01'),
    (23,'2018-08-10','3030-01-01'),
    (21,'2018-08-10','3030-01-01'),
    (21,'2018-08-10','3030-01-01'),
    (21,'2018-08-10','3030-01-01'),
    (30,'2018-08-10','3030-01-01')) V(Hkey, [Observation dt], [Retriment_dt])

;WITH DuplicateRecords AS
(
    SELECT
        T.HKey,
        T.[Observation dt]
    FROM
        #YourTable T
    GROUP BY
        T.HKey,
        T.[Observation dt]
    HAVING
        COUNT(1) > 1
),
RowNumber AS
(
    SELECT
        T.Hkey,
        T.[Observation dt],
        T.[Retriment_dt],
        RowNumberByHkey = ROW_NUMBER() OVER (PARTITION BY T.Hkey ORDER BY T.[Observation dt], T.[Retriment_dt])
    FROM
        #YourTable AS T
        INNER JOIN DuplicateRecords AS D ON
            T.Hkey = D.Hkey AND
            T.[Observation dt] = D.[Observation dt]
),
UpdatedValues AS
(
    SELECT
        R.Hkey,
        R.[Observation dt],
        R.[Retriment_dt],
        NewObservationDT = DATEADD(
            DAY,
            -1 * (R.RowNumberByHkey - 1),
            R.[Observation dt]),
        NewRetirementDT = GETDATE(),
        R.RowNumberByHkey
    FROM
        RowNumber AS R
),
RecordsToUpdate AS
(
    -- Need a row number to be able to update correctly, since the record is duplicated (need an ID to join)
    SELECT
        T.Hkey,
        T.[Observation dt],
        T.[Retriment_dt],
        RowNumberByHkey = ROW_NUMBER() OVER (PARTITION BY T.Hkey ORDER BY T.[Observation dt], T.[Retriment_dt])
    FROM
        #YourTable AS T
)
UPDATE T SET
    [Observation dt] = R.NewObservationDT,
    [Retriment_dt] = R.NewRetirementDT
FROM
    RecordsToUpdate AS T
    INNER JOIN UpdatedValues AS R ON
        T.HKey = R.HKey AND
        T.[Observation dt] = R.[Observation dt] AND
        T.RowNumberByHkey = R.RowNumberByHkey




SELECT 
    * 
FROM 
    #YourTable AS T 
ORDER BY 
    T.Hkey, 
    T.[Observation dt] DESC

Result:

Hkey    Observation dt  Retriment_dt
21      2018-08-10      2018-12-18
21      2018-08-09      2018-12-18
21      2018-08-08      2018-12-18
23      2018-08-10      2018-12-18
23      2018-08-09      2018-12-18
23      2018-08-08      2018-12-18
30      2018-08-10      3030-01-01

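This is a little tricky because each duplicate record has to be updated with a different value, so you need to generate some kind of unique ID (I used a row number) to match them up.

The different dates are generated by passing the row number to DATEADD as the day offset. This produces dates that are 1 day apart.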
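The row-number idea can also be written as a single update through a CTE. The following is only a sketch under stated assumptions, not the code above: the table name `dbo.YourTable` is hypothetical, both date columns are assumed to be of type DATE, and the `Masterkey` column from the question is assumed to decide which duplicate stays untouched (highest value first). Unlike the solution above, it leaves the first row of each duplicate group unchanged, as in the expected output in the question.

```sql
-- Sketch only: dbo.YourTable is a hypothetical name; column names follow the question.
WITH Numbered AS
(
    SELECT
        T.[Observation dt],
        T.[Retriment_dt],
        -- rn = 1 marks the row that keeps its dates;
        -- ordering by Masterkey DESC is an assumed tie-breaker.
        rn = ROW_NUMBER() OVER (
                 PARTITION BY T.Hkey, T.[Observation dt]
                 ORDER BY T.Masterkey DESC)
    FROM
        dbo.YourTable AS T
)
UPDATE Numbered SET
    -- 2nd duplicate moves back 1 day, 3rd moves back 2 days, and so on.
    [Observation dt] = DATEADD(DAY, 1 - rn, [Observation dt]),
    -- Retire the duplicate as of today.
    [Retriment_dt]   = CONVERT(DATE, GETDATE())
WHERE
    rn > 1;
```

One caveat with any variant of this approach: a shifted date can collide with a row that already exists for the same Hkey on the earlier day, so it is worth checking for remaining duplicates on (Hkey, [Observation dt]) before applying the primary key.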

Answer 1 (score: 0)

Using a temp table:

Create Table #tbl
(
hkey Int,
Observation Date,
Retriment Date
)
Insert Into #tbl Values
(23,'2018-10-08','3030-01-01'),
(23,'2018-10-08','3030-01-01'),
(23,'2018-10-08','3030-01-01'),
(21,'2018-11-08','3030-01-01'),
(21,'2018-11-08','3030-01-01'),
(21,'2018-11-08','3030-01-01'),
(30,'2018-11-08','3030-01-01')


Select Row_Number() OVER(Order By (Select Null)) As raworder,*  Into #temp From #tbl

Select hkey,
        DateAdd(Day,-Row_Number() Over (Partition By hkey Order By hkey)+1 , Observation) As newDT,  
        Case When (Row_Number() Over (Partition By hkey Order By hkey) = 1) Then Retriment Else Convert(Date,GetDate()) End As Retriment
    From #temp
   Order By raworder

Result:

hkey    newDT       Retriment
23      2018-10-08  3030-01-01
23      2018-10-07  2018-12-18
23      2018-10-06  2018-12-18
21      2018-11-08  3030-01-01
21      2018-11-07  2018-12-18
21      2018-11-06  2018-12-18
30      2018-11-08  3030-01-01

Answer 2 (score: 0)

My colleague did this in a similar way, but thanks for the replies. I have posted the code we used.

SELECT [healthplanentryhistory_avi_hkey]
    ,[effective_date]
    ,[expiration_date]
    ,[healthplanentryhistoryid]
    ,[hospitalmasterid]
    ,[plancode]
    ,[plangeneration]
    ,[code]
    ,[pawvalue]
    ,[quantitycoveredbyplan]
    ,[healthplanentrymasterid]
    ,[healthplanentryid]
    ,[healthplanid]
    ,[lastupdate]
    ,[origpawvalue]
    ,[active_ind]
    ,[hash_diff]
    ,[source_sys_id]
    ,[create_date]
    ,[update_date]
    ,cnt
    ,Rank
INTO ##tmphph
FROM (
    SELECT *
        ,COUNT(*) OVER (PARTITION BY [healthplanentryhistory_avi_hkey]) AS cnt
        ,RANK() OVER (
            PARTITION BY [healthplanentryhistory_avi_hkey] ORDER BY healthplanentryhistoryid DESC
            ) AS Rank
    FROM [atf_healthplanentryhistory_avi]
    ) AS t
WHERE t.cnt > 1
    AND t.rank > 1
ORDER BY healthplanentryhistoryid DESC;

---SELECT * FROM ##tmphph where healthplanentryhistory_avi_hkey = 0x039E7D809F8138B703FC9991E9D8F655
MERGE INTO [atf_healthplanentryhistory_avi] atf
USING ##tmphph TEMP
    ON atf.healthplanentryhistory_avi_hkey = TEMP.[healthplanentryhistory_avi_hkey]
        AND atf.effective_date = TEMP.effective_date
        AND atf.healthplanentryhistoryid = TEMP.healthplanentryhistoryid
        AND TEMP.rank > 1
WHEN MATCHED
    THEN
        UPDATE
        SET atf.effective_date = getdate() - TEMP.rank /* Sets effective_date to today's date minus rank days (not the original effective_date minus rank) */
            ,expiration_date = getdate() - TEMP.rank
            ,active_ind = 0;

DROP TABLE ##tmphph