Flag rows that meet two conditions and have the minimum of a third

Time: 2018-04-26 12:02:42

Tags: sql-server

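I need to flag every first occurrence (the lowest ID in table_a) where two conditions (customer and user) match those in another table (customer and user in table_b). A heavily simplified version of the problem:

Table A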

 Id          Customer        Users         
-----        --------        ----
 100           1001           abc
 101           1001           abc
 102           1001           xyz
 103           1001           xyz
 104           1002           abc
 105           1002           abc
 106           1002           xyz
 107           1002           xyz

Table B

Customer   Users    
--------   -----
  1001      abc
  1002      xyz

What I want:

 Id          Customer        User     include         
-----        --------        ----     -------
 100           1001           abc        1
 101           1001           abc        0
 102           1001           xyz        0
 103           1001           xyz        0
 104           1002           abc        0
 105           1002           abc        0
 106           1002           xyz        1  
 107           1002           xyz        0

Here is what I tried:

select a.*, case when exists(
    select 1
    from table_a a1, table_b b
    where a.customer=b.customer
    and a.user=b.user
    having min(a1.id)=a.id
    )
then 1 else 0 end as include

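But this flags only the first row (lowest ID) of the whole list. If that first row does not meet the conditions (its user and customer combination does not match one in table_b), nothing is flagged at all.

I am missing some logic here. Any suggestions? The real table_a has millions of rows, so speed is a concern. Besides the logic, I may also need some speed magic.

The full code is here: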

DROP TABLE IF EXISTS #table_a
DROP TABLE IF EXISTS #table_b

create table #table_a (Id char(3),Customer char(4),Users char(3))
insert into #table_a (Id,Customer,Users) values

('100','1001','abc'),
('101','1001','abc'),
('102','1001','xyz'),
('103','1001','xyz'),
('104','1002','abc'),
('105','1002','abc'),
('106','1002','xyz'),
('107','1002','xyz')

create table #table_b (Customer char(4),Users char(3))
insert into #table_b (Customer,Users) values
('1001','abc'),
('1002','xyz')

    select a.*
    , case when exists(
        select *
        from #table_a a1, #table_b b
        where a.customer=b.customer
        and a.users=b.users
        having min(a1.id)=a.id
        )

    then 1 else 0 end as include
    from #table_a a

3 Answers:

Answer 0 (score: 1):

If you can use window functions in your version of SQL Server, this will work.

WITH Includes AS (
    SELECT 
        a.*,
        CASE WHEN b.Customer IS NOT NULL THEN 1 ELSE 0 END AS [include],
        ROW_NUMBER() OVER (PARTITION BY a.Customer, a.Users ORDER BY a.Id) AS include_id
    FROM 
        #table_a a
        LEFT JOIN #table_b b ON b.Customer = a.Customer AND b.Users = a.Users)
SELECT
    a.*,
    CASE WHEN i.include_id = 1 THEN i.[include] ELSE 0 END AS [include]
FROM
    #table_a a
    LEFT JOIN Includes i ON i.Id = a.Id;

Basically, it builds a list of matches and then uses ROW_NUMBER() to pick the first row from each group.
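As a sketch of a possible simplification: the CTE already contains every row of #table_a, so the join back to #table_a may not be needed and the flag can be computed in one pass. The version below assumes, as the sample data suggests, that #table_b holds each (Customer, Users) pair at most once. The index and its name are illustrative assumptions and not part of the answer above; whether it helps on the real table should be checked against the execution plan.

```sql
-- Hypothetical supporting index: its key order matches PARTITION BY / ORDER BY below,
-- which may let SQL Server avoid a sort. Name and usefulness are assumptions.
CREATE INDEX IX_table_a_Customer_Users_Id
    ON #table_a (Customer, Users, Id);

SELECT
    a.*,
    CASE
        WHEN b.Customer IS NOT NULL                      -- the pair exists in #table_b
         AND ROW_NUMBER() OVER (PARTITION BY a.Customer, a.Users
                                ORDER BY a.Id) = 1       -- lowest Id within the pair
        THEN 1
        ELSE 0
    END AS [include]
FROM #table_a AS a
LEFT JOIN #table_b AS b
    ON b.Customer = a.Customer
   AND b.Users = a.Users
ORDER BY a.Id;
```

On the sample data this flags Ids 100 and 106, the same rows as the desired output in the question.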

Answer 1 (score: 1):

You can try the following query:

SELECT a.Id, a.Customer, a.Users, 
       CASE 
          WHEN SUM(IIF(b.Customer IS NOT NULL, 1, 0)) 
               OVER (PARTITION BY a.Customer ORDER BY a.Id) = 1 THEN 1
          ELSE 0
       END AS include 
FROM #table_a AS a
LEFT JOIN #table_b AS b ON a.Customer = b.Customer AND a.Users = b.Users

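The query assumes that each row of #table_a has at most one match in #table_b.

Explanation

The query uses SUM() OVER() with an ORDER BY clause to compute a running total of the records that have a match. So this query: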

SELECT a.Id, a.Customer, a.Users, 
       SUM(IIF(b.Customer IS NOT NULL, 1, 0)) 
       OVER (PARTITION BY a.Customer ORDER BY a.Id) AS cnt
FROM #table_a AS a
LEFT JOIN #table_b AS b ON a.Customer = b.Customer AND a.Users = b.Users

produces this output:

Id  Customer  Users cnt
-----------------------
100 1001      abc   1
101 1001      abc   2
102 1001      xyz   2
103 1001      xyz   2
104 1002      abc   0
105 1002      abc   0
106 1002      xyz   1
107 1002      xyz   2

We are looking for the records with cnt=1.

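One caveat worth checking, offered as a reading of the query rather than something the answer states: the cnt = 1 test is applied to every row, matched or not. In the sample data no unmatched row falls between the first and second matched rows of a customer, so the problem does not show. If one did, it would also carry a running total of 1 and be flagged. Partitioning by Customer alone also means that only the first matched pair per customer can be flagged. A minimal sketch of a stricter variant, which requires the row itself to match and counts per (Customer, Users) pair:

```sql
SELECT a.Id, a.Customer, a.Users,
       CASE
          WHEN b.Customer IS NOT NULL                     -- this row itself has a match
           AND SUM(IIF(b.Customer IS NOT NULL, 1, 0))
               OVER (PARTITION BY a.Customer, a.Users     -- count per pair, not per customer
                     ORDER BY a.Id
                     ROWS UNBOUNDED PRECEDING) = 1 THEN 1
          ELSE 0
       END AS [include]
FROM #table_a AS a
LEFT JOIN #table_b AS b ON a.Customer = b.Customer AND a.Users = b.Users;
```

The explicit ROWS frame is optional here because Id is unique in the sample; it is included on the assumption that a ROWS frame tends to be cheaper than the default RANGE frame on large inputs.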

Answer 2 (score: 0):

SELECT a.*, 
CASE WHEN a.Id = a1.Id THEN 1 ELSE 0 END AS [include]
FROM #table_a a
    LEFT JOIN #table_b b ON a.Customer = b.Customer
        AND a.Users = b.Users
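    OUTER APPLY (
        -- Lowest Id in #table_a for the matched pair; ORDER BY makes TOP 1 deterministic
        SELECT TOP 1 a2.Id
        FROM #table_a a2
        WHERE  a2.Customer = b.Customer
            AND a2.Users = b.Users
        ORDER BY a2.Id
        ) a1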
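The same lookup could also be done once per matched pair instead of once per row. A sketch, assuming Id is unique in #table_a as in the sample data; the names f and FirstId are made up for illustration:

```sql
SELECT a.*,
       CASE WHEN f.FirstId IS NOT NULL THEN 1 ELSE 0 END AS [include]
FROM #table_a AS a
LEFT JOIN (
    -- Lowest Id for every (Customer, Users) pair that also appears in #table_b
    SELECT MIN(a2.Id) AS FirstId
    FROM #table_a AS a2
    INNER JOIN #table_b AS b
        ON b.Customer = a2.Customer
       AND b.Users = a2.Users
    GROUP BY a2.Customer, a2.Users
) AS f ON f.FirstId = a.Id
ORDER BY a.Id;
```

Because of the GROUP BY, a pair that happened to appear more than once in #table_b would not duplicate rows in the result. Note that the sample declares Id as char(3), so MIN compares strings there; with a numeric Id column it would compare numbers.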