I have two tables, Diagnose and Exercise. I want to pull the exercise date closest to Diagnose_Date; it should be a single row from the Exercise table.
I have tried a left join with a condition that uses the DATEDIFF function:
SELECT D.ID,D.Diagnose_Date,D.Type1,D.SubType1,E.Exercise_Date,E.Field1,E.Field2,E.Field3
FROM Diagnose D
LEFT JOIN Exercise E
ON D.ID=E.ID
WHERE DATEDIFF(DAY,[Diagnose_Date],[Exercise_Date]) BETWEEN -30 AND 30
Any help would be much appreciated.
Thanks in advance.
Diagnose table
------------------------------------------
ID Diagnose_Date Type1 SubType1
------------------------------------------
1 10/01/2010 01 1.1
2 20/02/2012 02 2.2
3 30/03/2013 01 1.2
------------------------------------------
Exercise table
------------------------------------------
ID Exercise_Date Field1 Field2 Field3
------------------------------------------
1 01/01/2010 x y z
2 10/02/2012 a b c
2 01/04/2012 e f f
3 01/03/2013 x y z
3 05/04/2013 a b c
3 01/06/2013 x y z
------------------------------------------
The expected result should be:
------------------------------------------------------------------------
ID Diagnose_Date Exercise_Date Type1 SubType1 Field1 Field2 Field3
------------------------------------------------------------------------
1 10/01/2010 01/01/2010 01 1.1 x y z
2 20/02/2012 10/02/2012 02 2.2 a b c
3 30/03/2013 05/04/2013 01 1.2 a b c
-------------------------------------------------------------------------
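Answer 0 (score: 2)
First, in a CTE, get for each diagnosis the smallest interval between the diagnosis date and all of the exercise dates associated with that diagnosis.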
WITH MIN_DATES_CTE(ID, DATE_DIFF)
AS (
SELECT E.ID, MIN(ABS(DATEDIFF(DAY, D.[Diagnose_Date], E.[Exercise_Date])))
FROM Exercise E
INNER JOIN Diagnose D ON D.ID = E.ID
GROUP BY E.ID
)
Then join Diagnose and Exercise on ID and on that minimum interval:
SELECT D.ID,D.Diagnose_Date,D.Type1,D.SubType1,E.Exercise_Date,E.Field1,E.Field2,E.Field3
FROM Diagnose D
LEFT JOIN Exercise E ON D.ID = E.ID
INNER JOIN MIN_DATES_CTE ON MIN_DATES_CTE.ID = E.ID
WHERE ABS(DATEDIFF(DAY,[Diagnose_Date],[Exercise_Date])) = MIN_DATES_CTE.DATE_DIFF
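The two steps above can be traced outside the database. The following is a minimal Python sketch of the same logic, not the answer's own code: it uses the question's sample rows with dates read as dd/mm/yyyy, and the function name `closest_by_min_gap` is made up for illustration.

```python
from datetime import date

# Sample rows from the question (dates assumed to be dd/mm/yyyy).
diagnose = {1: date(2010, 1, 10), 2: date(2012, 2, 20), 3: date(2013, 3, 30)}
exercise = [
    (1, date(2010, 1, 1)),
    (2, date(2012, 2, 10)),
    (2, date(2012, 4, 1)),
    (3, date(2013, 3, 1)),
    (3, date(2013, 4, 5)),
    (3, date(2013, 6, 1)),
]


def closest_by_min_gap(diagnose, exercise):
    """Hypothetical helper mirroring the CTE plus the filtered join."""
    # Step 1 (the CTE): smallest absolute day gap per ID.
    min_gap = {}
    for ex_id, ex_date in exercise:
        if ex_id in diagnose:
            gap = abs((ex_date - diagnose[ex_id]).days)
            min_gap[ex_id] = min(gap, min_gap.get(ex_id, gap))
    # Step 2 (the join and WHERE): keep every exercise row whose gap
    # equals that minimum. Tied rows are all kept.
    return [
        (ex_id, diagnose[ex_id], ex_date)
        for ex_id, ex_date in exercise
        if ex_id in min_gap
        and abs((ex_date - diagnose[ex_id]).days) == min_gap[ex_id]
    ]


rows = closest_by_min_gap(diagnose, exercise)
```

On the sample data this gives one row per ID. If two exercise dates are equally far from the diagnosis date, this approach keeps both, so a tie-breaker would be needed where exactly one row is required.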
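Answer 1 (score: 1)
I assume you simply want to match each single diagnosis entry with a single exercise entry, based on which dates are closest to each other.
Here is my line of thought:
Do a full join (a cross join) of Diagnosis and Exercise, ordered by the absolute date difference, ascending.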
SELECT
D.ID,
D.Date,
E.ID,
E.Date,
ABS(DATEDIFF(day, D.Date, E.Date)) Diff
FROM Diagnosis D, Exercise E
ORDER BY Diff
You get a result like this:
ID Date ID Date Diff
3 2013-03-30 5 2013-03-25 5
2 2012-02-20 2 2012-02-10 10
3 2013-03-30 4 2013-03-01 29
2 2012-02-20 3 2012-04-01 41
3 2013-03-30 6 2013-06-01 63
1 2010-10-01 1 2010-01-01 273
3 2013-03-30 3 2012-04-01 363
2 2012-02-20 4 2013-03-01 375
2 2012-02-20 5 2013-03-25 399
3 2013-03-30 2 2012-02-10 414
2 2012-02-20 6 2013-06-01 467
1 2010-10-01 2 2012-02-10 497
1 2010-10-01 3 2012-04-01 548
2 2012-02-20 1 2010-01-01 780
1 2010-10-01 4 2013-03-01 882
1 2010-10-01 5 2013-03-25 906
1 2010-10-01 6 2013-06-01 974
3 2013-03-30 1 2010-01-01 1184
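Now you can see which dates are closest to each other, and how many days apart they are.
Of course, you would not use this as it stands, but from this list you can pick the first row: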
SELECT TOP 1
D.ID,
D.Date,
E.ID,
E.Date,
ABS(DATEDIFF(day, D.Date, E.Date)) Diff
FROM Diagnosis D, Exercise E
ORDER BY Diff
Now you can plug this statement into a LEFT JOIN, so that each diagnosis is matched individually with its closest exercise date.
Like this:
SELECT
fD.ID,
fD.Date,
fE.ID,
fE.Date
FROM
Diagnosis fD
LEFT JOIN Exercise fE
ON fE.ID = (SELECT TOP 1 E.ID
FROM Diagnosis D, Exercise E
WHERE D.ID = fD.ID
ORDER BY ABS(DATEDIFF(day, D.Date, E.Date)))
The result looks like this:
ID Date ID Date
1 2010-10-01 1 2010-01-01
2 2012-02-20 2 2012-02-10
3 2013-03-30 5 2013-03-25
Answer 2 (score: 1)
You can use OUTER APPLY:
SELECT d.ID,
d.Diagnose_Date,
d.Type1,
d.SubType1,
e.Exercise_Date,
e.Field1,
e.Field2,
e.Field3
FROM Diagnose d
OUTER APPLY
( SELECT TOP 1 Exercise_Date, Field1, Field2, Field3
FROM Exercise e
WHERE d.ID = e.ID
AND DATEDIFF(DAY, d.[Diagnose_Date], e.[Exercise_Date]) BETWEEN -30 AND 30
ORDER BY ABS(DATEDIFF(DAY, d.[Diagnose_Date], e.[Exercise_Date]))
) e;
Example on SQL Fiddle
I did some more testing on this and found that the approach using ROW_NUMBER() is the most efficient:
WITH CTE AS
( SELECT d.ID,
d.Diagnose_Date,
d.Type1,
d.SubType1,
e.Exercise_Date,
e.Field1,
e.Field2,
e.Field3,
RowNumber = ROW_NUMBER() OVER (PARTITION BY d.ID ORDER BY ABS(DATEDIFF(DAY,[Diagnose_Date],[Exercise_Date])))
FROM Diagnose D
LEFT JOIN Exercise E
ON D.ID = E.ID
)
SELECT ID,
Diagnose_Date,
Type1,
SubType1,
EID = ID,
Exercise_Date,
Field1,
Field2,
Field3
FROM CTE
WHERE RowNumber = 1;
I compared this with my first solution and with the answer that has the most votes. The results are as follows:
OUTER APPLY
Cost relative to batch: 34%
--------------------------------------------------
Table 'Exercise'. Scan count 3, logical reads 3
Table 'Diagnose'. Scan count 1, logical reads 1
--------------------------------------------------
Total. Scan count 4, logical reads 4
Self join with aggregates (most votes so far)
Cost relative to batch: 51%
--------------------------------------------------
Table 'Worktable'. Scan count 0, logical reads 0
Table 'Exercise'. Scan count 2, logical reads 4
Table 'Diagnose'. Scan count 2, logical reads 2
--------------------------------------------------
Total. Scan count 4, logical reads 6
ROW_NUMBER()
Cost relative to batch: 15%
--------------------------------------------------
Table 'Exercise'. Scan count 1, logical reads 3
Table 'Diagnose'. Scan count 1, logical reads 1
--------------------------------------------------
Total. Scan count 2, logical reads 4
Examples on SQL Fiddle
So the ROW_NUMBER solution has the lowest IO statistics and the lowest estimated cost.
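To make the ROW_NUMBER() idea concrete, here is a rough Python sketch of what the query computes: for each diagnosis, order its exercise dates by absolute day gap and keep the first. It is an illustration only, not a translation of the execution plan. The data is the question's sample (dates assumed to be dd/mm/yyyy) and the function name `closest_one_per_id` is hypothetical.

```python
from datetime import date

# Sample rows from the question (dates assumed to be dd/mm/yyyy).
diagnose = {1: date(2010, 1, 10), 2: date(2012, 2, 20), 3: date(2013, 3, 30)}
exercise = [
    (1, date(2010, 1, 1)),
    (2, date(2012, 2, 10)),
    (2, date(2012, 4, 1)),
    (3, date(2013, 3, 1)),
    (3, date(2013, 4, 5)),
    (3, date(2013, 6, 1)),
]


def closest_one_per_id(diagnose, exercise):
    """Hypothetical helper: one output row per diagnosis."""
    result = []
    for diag_id, diag_date in diagnose.items():
        # Exercise rows for this ID (the LEFT JOIN on ID).
        candidates = [ex_date for ex_id, ex_date in exercise if ex_id == diag_id]
        # Smallest absolute gap wins, like RowNumber = 1. A diagnosis with
        # no exercise rows still appears, with None in place of the date.
        best = min(
            candidates,
            key=lambda ex_date: abs((ex_date - diag_date).days),
            default=None,
        )
        result.append((diag_id, diag_date, best))
    return result


rows = closest_one_per_id(diagnose, exercise)
```

Unlike a filter on the minimum gap, this always yields exactly one row per diagnosis. In the SQL version, which of two equally close rows gets RowNumber = 1 is not determined unless a further column is added to the ORDER BY.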
Answer 3 (score: 0)
Using only standard SQL:
SELECT D.ID, D.Diagnose_Date, D.Type1, D.SubType1, E.Exercise_Date, E.Field1, E.Field2, E.Field3
FROM Diagnose D
LEFT JOIN Exercise E
ON E.ID=D.ID AND
E.Exercise_Date=(SELECT MAX(Exercise_Date) FROM Exercise WHERE Exercise.ID=D.ID AND Exercise.Exercise_Date<=D.Diagnose_Date)
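As I read it, this query picks the latest exercise date on or before the diagnosis date, which is not always the closest date overall. The short Python sketch below compares the two rules on the question's rows for ID 3 (dates assumed to be dd/mm/yyyy). Both function names are made up for illustration.

```python
from datetime import date


def latest_on_or_before(diag_date, exercise_dates):
    """Rule used by the query above: MAX(Exercise_Date) <= Diagnose_Date."""
    earlier = [d for d in exercise_dates if d <= diag_date]
    return max(earlier, default=None)


def closest(diag_date, exercise_dates):
    """Rule asked for in the question: smallest absolute day gap."""
    return min(
        exercise_dates,
        key=lambda d: abs((d - diag_date).days),
        default=None,
    )


# ID 3 from the question's sample data.
diag_date = date(2013, 3, 30)
exercise_dates = [date(2013, 3, 1), date(2013, 4, 5), date(2013, 6, 1)]

before = latest_on_or_before(diag_date, exercise_dates)
nearest = closest(diag_date, exercise_dates)
```

For ID 3 the first rule gives 01/03/2013 (29 days earlier), while the expected result in the question is 05/04/2013 (6 days later). The two rules agree only when the nearest exercise falls on or before the diagnosis.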