Suppose I have the following tables (dates are in dd/mm/yyyy format).
Table A
WorkId  DateA
------  ----------
1       01/01/2017
Table B
WorkId  DateB       Flag  User
------  ----------  ----  ----
1       01/12/2016  N     u1
1       03/12/2016  N     u2
1       01/01/2017  Y     u2
1       02/01/2017  Y     u3
1       02/01/2017  Y     u3
1       05/01/2017  N     u4
1       05/01/2017  N     u5
1       06/01/2017  N     u5
1       10/01/2017  Y     u5
1       12/01/2017  Y     u6
1       12/01/2017  N     u7
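Each record in Table A should be joined to a record in Table B on TableA.WorkId = TableB.WorkId and TableA.DateA = TableB.DateB (the matching row in Table B always has Flag = 'Y'). From this join I need WorkId, TableA.DateA, and TableB.User (User1 in the result below). For example, the Table A record above joins to the third row of Table B.
Then I need the first record in Table B that has Flag = 'N' and the smallest date after DateA. In this example that is the sixth row of Table B. I then need to add that row's user (User2) and date (DateB) to the result: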
Result
WorkId  DateA       DateB       User1  User2
------  ----------  ----------  -----  -----
1       01/01/2017  05/01/2017  u2     u4
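For anyone who wants to reproduce the example, the sample data above could be loaded with a script along these lines. This is only a sketch: the column types are assumptions, since the question does not state them, and the date literals assume the dd/mm/yyyy reading of the tables.

```sql
-- Assumed schema: the column types are not given in the question.
CREATE TABLE TableA (
    WorkId INT  NOT NULL,
    DateA  DATE NOT NULL
);

CREATE TABLE TableB (
    WorkId INT         NOT NULL,
    DateB  DATE        NOT NULL,
    Flag   CHAR(1)     NOT NULL,
    [User] VARCHAR(10) NOT NULL   -- bracketed because USER is a reserved word in T-SQL
);

-- Dates written as unseparated yyyymmdd literals.
INSERT INTO TableA (WorkId, DateA) VALUES (1, '20170101');

INSERT INTO TableB (WorkId, DateB, Flag, [User]) VALUES
    (1, '20161201', 'N', 'u1'),
    (1, '20161203', 'N', 'u2'),
    (1, '20170101', 'Y', 'u2'),
    (1, '20170102', 'Y', 'u3'),
    (1, '20170102', 'Y', 'u3'),
    (1, '20170105', 'N', 'u4'),
    (1, '20170105', 'N', 'u5'),
    (1, '20170106', 'N', 'u5'),
    (1, '20170110', 'Y', 'u5'),
    (1, '20170112', 'Y', 'u6'),
    (1, '20170112', 'N', 'u7');
```

Note that u4 and u5 both have a Flag = 'N' row on 05/01/2017, so a query that orders only by DateB has no defined way to prefer u4 over u5 as User2.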
I used the following query:
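WITH c AS (
    -- Row in TableB that matches each TableA row (its Flag is always 'Y')
    SELECT a.WorkId, a.DateA, b.[User] AS User1
    FROM TableA a
    INNER JOIN TableB b
        ON a.WorkId = b.WorkId AND a.DateA = b.DateB
),
c1 AS (
    -- Number the later Flag = 'N' rows by date for each matched row.
    -- Partition on c.WorkId: b.WorkId is NULL when no later 'N' row exists.
    SELECT c.*, b.DateB, b.[User] AS User2
        , ROW_NUMBER() OVER (PARTITION BY c.WorkId, c.DateA ORDER BY b.DateB) AS rn
    FROM c
    LEFT OUTER JOIN TableB b
        ON c.WorkId = b.WorkId AND b.Flag = 'N' AND b.DateB > c.DateA
)
SELECT *
FROM c1
WHERE rn = 1;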
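I have two indexes on each table: one on WorkId + Date and one on Date.
The problem is that the query is slow, and it gets much slower as the tables grow. Do you know a faster way to write it? Thanks.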
Answer 0 (score: 1)
Here is one way to formulate the query:
select a.*, bnext.DateB, b.[User] as User1, bnext.[User] as User2
from TableA a join
     TableB b
     on a.WorkId = b.WorkId and a.DateA = b.DateB outer apply
     -- earliest Flag = 'N' row after DateA, hence ascending order
     (select top 1 b2.*
      from TableB b2
      where b2.WorkId = a.WorkId and b2.DateB > a.DateA and b2.Flag = 'N'
      order by b2.DateB asc
     ) bnext;
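For the join, you want an index on TableB(WorkId, DateB); the keys can be in either order. For the subquery, you want an index on TableB(WorkId, DateB, Flag, User). Because the second index starts with the same two keys, it can also serve the join, so that one index is really all you need.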
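As a sketch, the index described here could be created as follows. The index names are hypothetical, and the column names assume the TableB layout shown in the question.

```sql
-- Hypothetical index name; key order follows the suggestion in this answer.
CREATE INDEX IX_TableB_WorkId_DateB_Flag_User
    ON TableB (WorkId, DateB, Flag, [User]);

-- Alternative that is NOT from this answer, offered only as something to test:
-- equality columns (WorkId, Flag) first, then the range column (DateB),
-- with [User] carried as an included column.
CREATE INDEX IX_TableB_WorkId_Flag_DateB
    ON TableB (WorkId, Flag, DateB)
    INCLUDE ([User]);
```

Which of the two performs better depends on the data distribution, so it is worth comparing the execution plans rather than assuming either one wins.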
Hmm. There is another approach that might be faster:
select WorkId, DateA, DateB, User1, [User] as User2
from (select ab.*,
             min(DateB) over (partition by WorkId, grp) as DateA,
             max(matched_user) over (partition by WorkId, grp) as User1,
             row_number() over (partition by WorkId, grp, Flag order by DateB) as seqnum
      from (select b.*,
                   -- running count of matched rows: every B row from a match onward shares its grp
                   sum(case when a.WorkId is not null then 1 else 0 end) over (partition by b.WorkId order by b.DateB) as grp,
                   case when a.WorkId is not null then b.[User] end as matched_user
            from TableB b left join
                 TableA a
                 on a.WorkId = b.WorkId and a.DateA = b.DateB
           ) ab
     ) ab
-- grp = 0 holds the B rows that precede the first match, so skip them
where seqnum = 1 and Flag = 'N' and grp > 0;
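This is much more complicated, and it relies on the rows in A not overlapping one another in their matches on B. The idea is that it finds the matches in B and then uses window functions to find the first row after each match whose flag is N.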
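To see how the grouping step behaves, the innermost part can be run on its own. The query below is a sketch that assumes the TableA and TableB column names from the question; it only exposes the intermediate grp value and is not part of the final result.

```sql
-- Show the running group number assigned to each TableB row.
-- grp increases by one at every row that matches a TableA record.
select b.WorkId, b.DateB, b.Flag, b.[User],
       sum(case when a.WorkId is not null then 1 else 0 end)
           over (partition by b.WorkId order by b.DateB) as grp
from TableB b left join
     TableA a
     on a.WorkId = b.WorkId and a.DateA = b.DateB
order by b.WorkId, b.DateB;
```

With the sample data, the two December 2016 rows get grp = 0 and every row from 01/01/2017 onward gets grp = 1, so the earliest Flag = 'N' row inside group 1 is the one on 05/01/2017. If Table A had a second record whose date fell before that N row, the N row would land in the later group, which is the overlap case this approach does not handle.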