尝试将重复字段更新为null,其中日期为min和date不为null

时间:2014-12-29 22:19:17

标签: sql sql-server

我正在尝试将license_no字段更新为null 如果 license_no重复并且具有最早的termination_date ,如果该termination_date不是空值。下面的SQL查询是我到目前为止所拥有的。如何确定update语句中重复项的min termination_date?

update driver
set license_no = NULL
where license_no IN (SELECT license_no 
                      FROM   driver 
                      GROUP  BY license_no 
                      HAVING ( Count(license_no) > 1 )) 

3 个答案:

答案 0 :(得分:1)

使用window function。试试这个。

UPDATE A
SET    license_no = NULL
FROM   driver A
       JOIN (SELECT Min(termination_date) termination_date,
                    license_no
             FROM   (SELECT Row_number()OVER(partition BY license_no ORDER BY termination_date desc) rn,
                            *
                     FROM  driver) o
             WHERE  rn > 1
             GROUP  BY license_no) B
         ON a.license_no = b.license_no
            AND a.termination_date = b.termination_date 

答案 1 :(得分:0)

我将使用表变量来获取一些示例数据。我将添加许可证编号和日期的各种组合以捕捉所有可能性。

此外,我将在您的架构中添加并使用ID列。我的解决方案我需要分别识别每一行。仅通过许可证号识别行是不够的。

DECLARE @TDrivers TABLE (ID int NOT NULL PRIMARY KEY, license_no varchar(50) NULL, termination_date date NULL);

insert into @TDrivers values (1, NULL, '20130101');
insert into @TDrivers values (2, NULL, '20130101');
insert into @TDrivers values (3, NULL, '20130102');
insert into @TDrivers values (4, NULL, '20130103');
insert into @TDrivers values (5, NULL, '20130104');

-- only one license with this number
insert into @TDrivers values (6, '01', '20140101');

-- only one license with this number
insert into @TDrivers values (7, '02', NULL);

-- two licenses with this number, different dates
insert into @TDrivers values (8, '03', '20140301');
insert into @TDrivers values (9, '03', '20140302');

-- two licenses with this number, same dates
insert into @TDrivers values (10, '04', '20140401');
insert into @TDrivers values (11, '04', '20140401');

-- two licenses with this number, dates are not null and null
insert into @TDrivers values (12, '05', '20140501');
insert into @TDrivers values (13, '05', NULL);

-- two licenses with this number, both dates are null
insert into @TDrivers values (14, '06', NULL);
insert into @TDrivers values (15, '06', NULL);

-- three licenses with this number, dates are unique and not null
insert into @TDrivers values (16, '07', '20140701');
insert into @TDrivers values (17, '07', '20140702');
insert into @TDrivers values (18, '07', '20140703');

-- three licenses with this number, dates are not unique and not null
insert into @TDrivers values (19, '08', '20140801');
insert into @TDrivers values (20, '08', '20140801');
insert into @TDrivers values (21, '08', '20140802');

-- dates are unique and have one null
insert into @TDrivers values (22, '09', '20140901');
insert into @TDrivers values (23, '09', '20140902');
insert into @TDrivers values (24, '09', '20140903');
insert into @TDrivers values (25, '09', NULL);

-- dates are unique and have several nulls
insert into @TDrivers values (26, '10', '20141001');
insert into @TDrivers values (27, '10', '20141002');
insert into @TDrivers values (28, '10', '20141003');
insert into @TDrivers values (29, '10', NULL);
insert into @TDrivers values (30, '10', NULL);
insert into @TDrivers values (31, '10', NULL);

-- dates are not unique and have one null
insert into @TDrivers values (32, '11', '20141101');
insert into @TDrivers values (33, '11', '20141101');
insert into @TDrivers values (34, '11', '20141102');
insert into @TDrivers values (35, '11', '20141103');
insert into @TDrivers values (36, '11', NULL);

-- dates are not unique and have several nulls
insert into @TDrivers values (37, '12', '20141201');
insert into @TDrivers values (38, '12', '20141201');
insert into @TDrivers values (39, '12', '20141202');
insert into @TDrivers values (40, '12', '20141203');
insert into @TDrivers values (41, '12', NULL);
insert into @TDrivers values (42, '12', NULL);
insert into @TDrivers values (43, '12', NULL);

SELECT * FROM @TDrivers ORDER BY ID;

首先,我们会找到多次出现的所有许可证号码:

WITH
CTE_DuplicatedLicenses
AS
(
    SELECT license_no
    FROM @TDrivers AS d
    WHERE license_no IS NOT NULL
    -- there is no need to update license numbers that are NULL already
    GROUP BY license_no
    HAVING (COUNT(license_no) > 1)
)

然后,对于每个重复的许可证号,我们正在寻找一个具有最小终止日期的行,而不是NULL。要在多个行具有相同的终止日期时解决歧义,请使用ID中的ORDER BY列。

,CTE_LicensesWithDates
AS
(
    SELECT
        CTE_Dates.ID
    FROM
        CTE_DuplicatedLicenses
        CROSS APPLY
        (
            SELECT TOP (1) d.ID
            FROM @TDrivers AS d
            WHERE
                d.license_no = CTE_DuplicatedLicenses.license_no
                AND d.termination_date IS NOT NULL
            ORDER BY
                d.termination_date, d.ID
        ) AS CTE_Dates
)

最后一个CTE为我们提供了应更新的所有IDs

update @TDrivers
set license_no = NULL
where ID IN
(
    SELECT ID
    FROM CTE_LicensesWithDates
);

SELECT * FROM @TDrivers ORDER BY ID;

这是原始表:

ID   license_no   termination_date
1    NULL         2013-01-01
2    NULL         2013-01-01
3    NULL         2013-01-02
4    NULL         2013-01-03
5    NULL         2013-01-04
6    01           2014-01-01
7    02           NULL
8    03           2014-03-01
9    03           2014-03-02
10   04           2014-04-01
11   04           2014-04-01
12   05           2014-05-01
13   05           NULL
14   06           NULL
15   06           NULL
16   07           2014-07-01
17   07           2014-07-02
18   07           2014-07-03
19   08           2014-08-01
20   08           2014-08-01
21   08           2014-08-02
22   09           2014-09-01
23   09           2014-09-02
24   09           2014-09-03
25   09           NULL
26   10           2014-10-01
27   10           2014-10-02
28   10           2014-10-03
29   10           NULL
30   10           NULL
31   10           NULL
32   11           2014-11-01
33   11           2014-11-01
34   11           2014-11-02
35   11           2014-11-03
36   11           NULL
37   12           2014-12-01
38   12           2014-12-01
39   12           2014-12-02
40   12           2014-12-03
41   12           NULL
42   12           NULL
43   12           NULL

这是更新的表格:

ID   license_no   termination_date
1    NULL         2013-01-01
2    NULL         2013-01-01
3    NULL         2013-01-02
4    NULL         2013-01-03
5    NULL         2013-01-04
6    01           2014-01-01
7    02           NULL
8    NULL         2014-03-01
9    03           2014-03-02
10   NULL         2014-04-01
11   04           2014-04-01
12   NULL         2014-05-01
13   05           NULL
14   06           NULL
15   06           NULL
16   NULL         2014-07-01
17   07           2014-07-02
18   07           2014-07-03
19   NULL         2014-08-01
20   08           2014-08-01
21   08           2014-08-02
22   NULL         2014-09-01
23   09           2014-09-02
24   09           2014-09-03
25   09           NULL
26   NULL         2014-10-01
27   10           2014-10-02
28   10           2014-10-03
29   10           NULL
30   10           NULL
31   10           NULL
32   NULL         2014-11-01
33   11           2014-11-01
34   11           2014-11-02
35   11           2014-11-03
36   11           NULL
37   NULL         2014-12-01
38   12           2014-12-01
39   12           2014-12-02
40   12           2014-12-03
41   12           NULL
42   12           NULL
43   12           NULL

要运行此示例,只需将所有代码块放在一个脚本中即可。

您可以在结果中看到ID为8,10,12的行已更新。

ID为14,15的行未更新,因为它们都具有NULL日期。

对于每个其他许可证编号,您可以看到只有一行设置为NULL,即使多行具有相同的日期。

答案 2 :(得分:0)

我会使用这样的标准SQL:

UPDATE
  driver d
SET
  license_no = null
WHERE 
  Exists ( SELECT 
             d2.license_no 
           FROM 
             driver d2 
           WHERE 
             d2.license_no = d.license_no 
           GROUP BY 
             d2.license_no 
           HAVING 
             Count(d2.license_no) > 1 
         ) AND
  Exists ( SELECT 
             d3.license_no
           FROM 
             driver d3
           WHERE 
             d3.license_no = d.license_no AND
             d3.termination_date = d.termination_date AND 
             d3.termination_date IS NOT NULL AND
             d3.termination_date IN ( SELECT 
                                        Min(d4.termination_date) 
                                      FROM 
                                        driver d4 
                                      WHERE 
                                         d4.license_no = d.license_no )                            
         )

第一个Exists()函数检查重复项。

第二个Exists()函数检查termination_date约束。

希望它有所帮助!