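I need to flag every first occurrence (the lowest Id in table_a) where two columns (customer and user) match the same two columns in another table (customer and user in table_b). A very simplified version of the problem:
Table A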
Id Customer Users
----- -------- ----
100 1001 abc
101 1001 abc
102 1001 xyz
103 1001 xyz
104 1002 abc
105 1002 abc
106 1002 xyz
107 1002 xyz
Table B
Customer Users
-------- -----
1001 abc
1002 xyz
What I want:
Id Customer User include
----- -------- ---- -------
100 1001 abc 1
101 1001 abc 0
102 1001 xyz 0
103 1001 xyz 0
104 1002 abc 0
105 1002 abc 0
106 1002 xyz 1
107 1002 xyz 0
This is what I tried:
select a.*, case when exists(
select 1
from table_a a1, table_b b
where a.customer=b.customer
and a.user=b.user
having min(a1.id)=a.id
)
then 1 else 0 end as include
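But this flags only the first row of the whole list (the lowest Id). If the condition is not met on that first row (its user and customer combination does not match a combination in table_b), nothing is flagged at all.
I'm missing some logic here. Any suggestions? The real table_a has millions of rows, so speed is an issue. Besides the logic, I may also need some speed magic.
The full code is here: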
DROP TABLE IF EXISTS #table_a
DROP TABLE IF EXISTS #table_b
create table #table_a (Id char(3),Customer char(4),Users char(3))
insert into #table_a (Id,Customer,Users) values
('100','1001','abc'),
('101','1001','abc'),
('102','1001','xyz'),
('103','1001','xyz'),
('104','1002','abc'),
('105','1002','abc'),
('106','1002','xyz'),
('107','1002','xyz')
create table #table_b (Customer char(4),Users char(3))
insert into #table_b (Customer,Users) values
('1001','abc'),
('1002','xyz')
select a.*
, case when exists(
select *
from #table_a a1, #table_b b
where a.customer=b.customer
and a.users=b.users
having min(a1.id)=a.id
)
then 1 else 0 end as include
from #table_a a
Answer 0 (score: 1)
If you can use window functions in your version of SQL Server, this should work.
WITH Includes AS (
SELECT
a.*,
CASE WHEN b.Customer IS NOT NULL THEN 1 ELSE 0 END AS [include],
ROW_NUMBER() OVER (PARTITION BY a.Customer, a.Users ORDER BY a.Id) AS include_id
FROM
#table_a a
LEFT JOIN #table_b b ON b.Customer = a.Customer AND b.Users = a.Users)
SELECT
a.*,
CASE WHEN i.include_id = 1 THEN i.[include] ELSE 0 END AS [include]
FROM
#table_a a
LEFT JOIN Includes i ON i.Id = a.Id;
Basically, it builds a list of matches and then uses ROW_NUMBER() to pick the first row from each group.
Answer 1 (score: 1)
You can try the following query:
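The same idea can probably be written as a single pass, without joining the CTE back to #table_a. This is a minimal sketch, assuming the #table_a and #table_b temp tables from the question exist and that each (Customer, Users) pair appears at most once in #table_b (otherwise the join would duplicate rows). The index name is made up for illustration, and whether the optimizer actually uses it to avoid a sort depends on your data and server version:

```sql
-- Optional: an index in (Customer, Users, Id) order lines up with the
-- PARTITION BY / ORDER BY below and may let SQL Server skip a sort.
-- The index name is hypothetical, not something from the question.
CREATE INDEX IX_table_a_Customer_Users_Id
    ON #table_a (Customer, Users, Id);

SELECT
    a.Id,
    a.Customer,
    a.Users,
    CASE
        -- Flag a row only if its pair exists in #table_b
        -- and it has the lowest Id within that pair.
        WHEN b.Customer IS NOT NULL
         AND ROW_NUMBER() OVER (PARTITION BY a.Customer, a.Users
                                ORDER BY a.Id) = 1
        THEN 1
        ELSE 0
    END AS [include]
FROM #table_a AS a
LEFT JOIN #table_b AS b
       ON b.Customer = a.Customer
      AND b.Users = a.Users
ORDER BY a.Id;
```

On the sample data this flags Ids 100 and 106 only. With millions of rows it is worth comparing the execution plans of both versions before settling on one.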
SELECT a.Id, a.Customer, a.Users,
CASE
WHEN SUM(IIF(b.Customer IS NOT NULL, 1, 0))
OVER (PARTITION BY a.Customer ORDER BY a.Id) = 1 THEN 1
ELSE 0
END AS include
FROM #table_a AS a
LEFT JOIN #table_b AS b ON a.Customer = b.Customer AND a.Users = b.Users
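The query assumes that there is at most one match between #table_a and #table_b.
Explanation
The query uses SUM() OVER() with an ORDER BY clause to compute a running total of the records that have a match. So, this query: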
SELECT a.Id, a.Customer, a.Users,
SUM(IIF(b.Customer IS NOT NULL, 1, 0))
OVER (PARTITION BY a.Customer ORDER BY a.Id) AS cnt
FROM #table_a AS a
LEFT JOIN #table_b AS b ON a.Customer = b.Customer AND a.Users = b.Users
produces this output:
Id Customer Users cnt
-----------------------
100 1001 abc 1
101 1001 abc 2
102 1001 xyz 2
103 1001 xyz 2
104 1002 abc 0
105 1002 abc 0
106 1002 xyz 1
107 1002 xyz 2
We are looking for the records with cnt = 1.
Answer 2 (score: 0)
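One thing to watch for: because the running total is partitioned by Customer only, it stays at 1 on any non-matching row that comes after the first match and before the second one. The sample data happens to avoid this, since the matching rows of each customer are adjacent. A small sketch of the effect, using hypothetical #demo_a and #demo_b tables that are not part of the question, and a variant that partitions by both columns instead (still assuming each pair appears at most once in the lookup table):

```sql
DROP TABLE IF EXISTS #demo_a;
DROP TABLE IF EXISTS #demo_b;

-- Hypothetical data: a non-matching row (101) sits between two matching rows.
CREATE TABLE #demo_a (Id char(3), Customer char(4), Users char(3));
INSERT INTO #demo_a (Id, Customer, Users) VALUES
    ('100', '1001', 'abc'),
    ('101', '1001', 'xyz'),
    ('102', '1001', 'abc');

CREATE TABLE #demo_b (Customer char(4), Users char(3));
INSERT INTO #demo_b (Customer, Users) VALUES ('1001', 'abc');

SELECT
    a.Id, a.Customer, a.Users,
    -- Partitioned by Customer only: the running total is 1, 1, 2,
    -- so row 101 is flagged although it has no match.
    CASE WHEN SUM(IIF(b.Customer IS NOT NULL, 1, 0))
                  OVER (PARTITION BY a.Customer ORDER BY a.Id) = 1
         THEN 1 ELSE 0 END AS include_by_customer,
    -- Partitioned by Customer and Users: only the first row of a
    -- matching pair reaches a running total of exactly 1.
    CASE WHEN SUM(IIF(b.Customer IS NOT NULL, 1, 0))
                  OVER (PARTITION BY a.Customer, a.Users ORDER BY a.Id) = 1
         THEN 1 ELSE 0 END AS include_by_pair
FROM #demo_a AS a
LEFT JOIN #demo_b AS b
       ON a.Customer = b.Customer
      AND a.Users = b.Users
ORDER BY a.Id;
```

Here include_by_customer comes out as 1, 1, 0 while include_by_pair comes out as 1, 0, 0, so the second form looks like the safer choice when matching and non-matching rows can be interleaved.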
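SELECT a.*,
       CASE WHEN a.Id = a1.Id THEN 1 ELSE 0 END AS [include]
FROM #table_a a
LEFT JOIN #table_b b ON a.Customer = b.Customer
                    AND a.Users = b.Users
OUTER APPLY (
    -- Lowest Id for the matched pair; returns no row (a1.Id is NULL) when b has no match.
    SELECT TOP 1 a2.Id
    FROM #table_a a2
    WHERE a2.Customer = b.Customer
      AND a2.Users = b.Users
    ORDER BY a2.Id  -- without ORDER BY, TOP 1 may return any row of the group
) a1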