Using Checksum to find rows that do not exist

Time: 2015-08-07 16:00:02

Tags: sql sql-server

Please see the DDL below:

create table #dbNames (reference int identity not null, name1 varchar(30), 
                       name2 varchar(30), dateadded datetime, primary key (reference))
insert into #dbNames (name1, name2, dateadded) values ('Bert', 'Claire', '2010-01-01')
insert into #dbNames (name1, name2, dateadded) values ('Claire', 'Bert', '2015-01-01')

I want the query output to be:

Claire Bert 2015-01-01

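2015-01-01 is more recent than 2010-01-01. Claire and Bert should appear only once, in the most recent row.

I am thinking of creating a Checksum column. However, Checksum produces two different values: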

declare @checksum int
declare @checksum2 int

set @Checksum = checksum('Claire,Bert')
set @Checksum2 = checksum('Bert,Claire')

print @checksum
print @checksum2

Is there an algorithm I can use so that the values are the same, i.e. so that @checksum and @checksum2 above produce the same result?
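
CHECKSUM depends on the order of its arguments, so one possible approach is to put the two names into a fixed order before hashing. The following is only a minimal sketch, not part of the original post: the variables @a and @b are hypothetical, and it assumes neither name is NULL and that the names are compared under the default collation.

```sql
-- Hypothetical variables for illustration only
declare @a varchar(30) = 'Claire'
declare @b varchar(30) = 'Bert'

-- Always pass the alphabetically smaller name first, so that
-- ('Claire', 'Bert') and ('Bert', 'Claire') hash the same arguments
select checksum(case when @a <= @b then @a else @b end,
                case when @a <= @b then @b else @a end) as pair_checksum
```

Swapping the values of @a and @b should return the same pair_checksum. Note that CHECKSUM is a hash, so two different pairs can still collide; it is safer to compare the ordered names themselves when an exact match matters.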

1 Answer:

Answer 0: (score: 0)

Try this:

insert into #dbNames
            (
                name1
            ,   name2
            ,   dateadded
            )
    VALUES  ('Bert', 'Claire', '2010-01-01')
        ,   ('Claire', 'Bert', '2015-01-01')
        ,   ('John', 'Smith', '2015-01-02')
        ,   ('Smith', 'John', '2015-02-01')
        ,   ('Joe', 'Blow', '2015-03-01')

;WITH
    cte1 AS
    (
        SELECT      *,
                    full_name = name1 + ' ' + name2,
                    alt_name  = name2 + ' ' + name1
        FROM        #dbNames
    ),
    cte2 AS
    (
        SELECT      c1.*,
                    final_name =
                        CASE 
                            WHEN c2.reference IS NULL THEN c1.full_name
                            WHEN c1.dateadded >= c2.dateadded THEN c1.full_name
                            ELSE c2.full_name
                        END
        FROM        cte1    c1
        LEFT JOIN   cte1    c2  ON (c1.full_name = c2.full_name OR c1.full_name = c2.alt_name)
                                AND c1.reference <> c2.reference
    ),
    cte3 AS
    (
        SELECT      *,
                    -- DESC so that rownumber = 1 is the most recent row of each pair
                    rownumber = ROW_NUMBER() OVER (PARTITION BY final_name ORDER BY dateadded DESC)
        FROM        cte2
    )

SELECT *
FROM cte3
WHERE rownumber = 1

I have not tested it on millions of rows.
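
An alternative that avoids the self-join is to sort each pair of names into a fixed order and partition on that ordered pair. This is a sketch under assumptions rather than a tested solution: first_name and second_name are hypothetical helper columns, both names are assumed to be non-NULL, and the comparison relies on the default collation.

```sql
;WITH
    ordered AS
    (
        SELECT      *,
                    -- hypothetical helper columns: the pair in alphabetical order
                    first_name  = CASE WHEN name1 <= name2 THEN name1 ELSE name2 END,
                    second_name = CASE WHEN name1 <= name2 THEN name2 ELSE name1 END
        FROM        #dbNames
    ),
    ranked AS
    (
        SELECT      *,
                    -- newest row of each unordered pair gets rownumber = 1
                    rownumber = ROW_NUMBER() OVER (PARTITION BY first_name, second_name
                                                   ORDER BY dateadded DESC)
        FROM        ordered
    )

SELECT name1, name2, dateadded
FROM ranked
WHERE rownumber = 1
```

For the five sample rows inserted above, this should keep the rows dated 2015-01-01 (Claire, Bert), 2015-02-01 (Smith, John) and 2015-03-01 (Joe, Blow). Because each row is read once and no join is needed, it may also behave better on large tables, though that has not been measured here.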