I'm very new to T-SQL, but I need to know whether this is possible. My data looks like this:
location_id,initial_date,final_date,asset_id,fixed_fee,uid
1,1/1/2005 0:00,11/3/2010 0:00,10025,21,22T0TG9UT
1,1/1/2005 0:00,7/26/2010 0:00,10026,21,22T0TG8AC
1,1/1/2005 0:00,7/26/2010 0:00,10027,21,22T0TG8AF
1,1/1/2005 0:00,4/20/2011 0:00,10028,21,22T0TG8AI
1,6/13/2011 0:00,12/31/2048 0:00,10028,12.5,38P0WGUV3
1,4/20/2011 0:00,6/13/2011 0:00,10028,21,3770QEMG1
1,4/20/2011 0:00,6/13/2011 0:00,10029,21,3770QEUYX
1,6/13/2011 0:00,12/31/2048 0:00,10029,12.5,38P0WH6G4
1,1/1/2005 0:00,4/20/2011 0:00,10029,21,22T0TG8AK
1,1/1/2005 0:00,6/13/2011 0:00,10030,21,22T0TG8AM
1,6/13/2011 0:00,12/31/2048 0:00,10030,12.5,38P0WHG30
1,6/13/2011 0:00,12/31/2048 0:00,10031,12.5,38P0WHN50
1,1/1/2005 0:00,6/13/2011 0:00,10031,21,22T0TG8AR
1,1/1/2005 0:00,4/14/2014 0:00,10158,21,22T0TG8AW
1,4/15/2014 0:00,12/31/2048 0:00,10158,12.5,41M0TAZNL
1,4/15/2014 0:00,12/31/2048 0:00,10159,12.5,41M0TBXIS
1,1/1/2005 0:00,4/14/2014 0:00,10159,21,22T0TG8B0
1,1/1/2005 0:00,4/14/2014 0:00,10160,21,22T0TG8B2
1,4/15/2014 0:00,12/31/2048 0:00,10160,12.5,41M0TCKZM
1,4/15/2014 0:00,12/31/2048 0:00,10161,12.5,41M0TD5P7
1,1/1/2005 0:00,4/14/2014 0:00,10161,21,22T0TG8BH
1,1/1/2005 0:00,7/26/2010 0:00,10162,21,22T0TG8BJ
1,1/1/2005 0:00,11/3/2010 0:00,10163,21,22T0TG8BL
1,1/1/2005 0:00,7/26/2010 0:00,10164,21,22T0TG8BN
1,12/13/2010 0:00,12/31/2048 0:00,10333,15,33L0OR1MH
1,12/13/2010 0:00,12/31/2048 0:00,10334,15,33L0ORB5R
1,1/1/2005 0:00,12/31/2048 0:00,10336,5,22T0TG8BQ
1,1/1/2005 0:00,12/31/2048 0:00,10337,5,22T0TG8BR
1,1/1/2005 0:00,12/31/2048 0:00,10338,5,22T0TG8BT
1,1/1/2005 0:00,12/31/2048 0:00,10339,5,22T0TG8BV
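The problem I'm running into is that some assets get moved or have their fee structure changed. When that happens, the existing row is given a final_date (12/31/2048 is just a placeholder end date), and a new row is created for the asset with the same information but a new initial_date, a new uid, and a final_date of 12/31/2048.
Date ranges are not allowed to overlap, and overlaps seem to be causing a lot of other errors. If the fee structure changes on January 1, the next listed date must be January 2. So I need some T-SQL that checks each date range against the next one wherever asset_id and location_id match across multiple rows.
Any advice or general guidance on this would be much appreciated. Help!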
Answer 0 (score: 0)
Yes, a problem like this can be solved with SQL!
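To try the queries in this answer, it helps to have a small test table. The following is only a sketch: the temporary table name #tbl and all column types are assumptions (the question shows just the column names and sample values), and the multi-row VALUES syntax needs SQL Server 2008 or later. The rows are copied from the question's data. The three rows for asset 10028 share boundary dates (4/20/2011 and 6/13/2011) and therefore overlap, while the two rows for asset 10158 do not.

```sql
-- Hypothetical setup: the column types below are assumptions, not taken from the question.
CREATE TABLE #tbl
(
    [location_id]  INT           NOT NULL,
    [initial_date] DATETIME      NOT NULL,
    [final_date]   DATETIME      NOT NULL,
    [asset_id]     INT           NOT NULL,
    [fixed_fee]    DECIMAL(9, 2) NOT NULL,
    [uid]          VARCHAR(20)   NOT NULL
);

-- A few rows from the question, written with unambiguous YYYYMMDD date literals.
INSERT INTO #tbl ([location_id], [initial_date], [final_date], [asset_id], [fixed_fee], [uid])
VALUES
    (1, '20050101', '20110420', 10028, 21,   '22T0TG8AI'),
    (1, '20110420', '20110613', 10028, 21,   '3770QEMG1'),
    (1, '20110613', '20481231', 10028, 12.5, '38P0WGUV3'),
    (1, '20050101', '20140414', 10158, 21,   '22T0TG8AW'),
    (1, '20140415', '20481231', 10158, 12.5, '41M0TAZNL');
```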
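First, I came up with this solution, which is fairly straightforward. It simply looks for every record in the table that has a matching record (based on asset ID and location ID) whose date range overlaps, using the BETWEEN operator to check the possible combinations.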
-- Return every row that overlaps another row for the same location and asset.
SELECT
    A.[uid],
    A.[location_id],
    A.[asset_id],
    A.[initial_date],
    A.[final_date],
    A.[fixed_fee]
FROM #tbl A
WHERE EXISTS
(
    SELECT
        *
    FROM #tbl B
    WHERE A.[location_id] = B.[location_id]
        AND A.[asset_id] = B.[asset_id]
        AND A.[uid] != B.[uid] -- do not compare a row with itself
        AND (
            -- BETWEEN is inclusive, so ranges that merely share a boundary date also count as overlapping
            A.[initial_date] BETWEEN B.[initial_date] AND B.[final_date]
            OR A.[final_date] BETWEEN B.[initial_date] AND B.[final_date]
            OR B.[initial_date] BETWEEN A.[initial_date] AND A.[final_date]
            OR B.[final_date] BETWEEN A.[initial_date] AND A.[final_date]
        )
)
ORDER BY
    A.[location_id],
    A.[asset_id],
    A.[initial_date]
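As a side note, the four BETWEEN checks can be collapsed into two comparisons: two inclusive ranges overlap exactly when each one starts on or before the other one ends. The sketch below assumes that every row has initial_date <= final_date (otherwise the two forms can differ) and that the data lives in a temporary table named #tbl; substitute your own table name as needed.

```sql
-- Same overlap test as above, written as a two-comparison predicate.
-- Assumption: [initial_date] <= [final_date] holds on every row.
SELECT
    A.[uid],
    A.[location_id],
    A.[asset_id],
    A.[initial_date],
    A.[final_date],
    A.[fixed_fee]
FROM #tbl A
WHERE EXISTS
(
    SELECT 1
    FROM #tbl B
    WHERE A.[location_id] = B.[location_id]
        AND A.[asset_id] = B.[asset_id]
        AND A.[uid] != B.[uid]                    -- skip the row itself
        AND A.[initial_date] <= B.[final_date]    -- A starts on or before B ends
        AND B.[initial_date] <= A.[final_date]    -- B starts on or before A ends
)
ORDER BY
    A.[location_id],
    A.[asset_id],
    A.[initial_date];
```

Changing both `<=` to `<` would stop treating a shared boundary date as an overlap, which is not what the question asks for, so the inclusive form is kept here.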
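...but then I was feeling a bit creative and also came up with this one. It gives the same result, but it builds a series of all possible dates, joins that series to the data to find which asset + location combinations have more than one entry on the same date, and then returns all the records it found.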
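WITH [cte_min_max] AS
(
    -- Bounds of the date series. Any overlap contains the later row's initial_date,
    -- so MAX([initial_date]) is a sufficient upper bound.
    SELECT
        MIN([initial_date]) [min_date],
        MAX([initial_date]) [max_date]
    FROM #tbl
),
[cte_recursion] AS
(
    -- One row per day from min_date through max_date.
    SELECT
        [min_date] [date],
        [max_date]
    FROM [cte_min_max]
    UNION ALL
    SELECT
        DATEADD(DAY, 1, [date]),
        [max_date]
    FROM [cte_recursion]
    WHERE [date] < [max_date]
),
[cte_duplicates] AS
(
    -- Days that are covered by two or more rows of the same location and asset.
    SELECT
        A.[date],
        B.[location_id],
        B.[asset_id]
    FROM [cte_recursion] A
    INNER JOIN #tbl B
        ON A.[date] BETWEEN B.[initial_date] AND B.[final_date]
    GROUP BY
        A.[date],
        B.[location_id],
        B.[asset_id]
    HAVING COUNT(*) >= 2
)
SELECT
    A.[uid],
    A.[location_id],
    A.[asset_id],
    A.[initial_date],
    A.[final_date],
    A.[fixed_fee]
FROM #tbl A
WHERE EXISTS
(
    SELECT
        *
    FROM [cte_duplicates] B
    WHERE A.[location_id] = B.[location_id]
        AND A.[asset_id] = B.[asset_id]
        -- BETWEEN, rather than matching only the two endpoints, also catches
        -- a row whose range fully contains another row's range
        AND B.[date] BETWEEN A.[initial_date] AND A.[final_date]
)
ORDER BY
    A.[location_id],
    A.[asset_id],
    A.[initial_date]
OPTION (MAXRECURSION 0) -- lift the default limit of 100 recursion levels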
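Since the question asks about checking each date range against the next one, a window function is another option. This is only a sketch and not part of the two solutions above: it assumes SQL Server 2012 or later (for the ROWS frame on an aggregate), the temporary table #tbl, and the hypothetical names [cte_prev] and [prev_max_final_date]. Unlike the queries above, it returns only the later row of each overlap, that is, the row that starts on or before the point where an earlier row for the same asset and location ends.

```sql
WITH [cte_prev] AS
(
    SELECT
        [uid],
        [location_id],
        [asset_id],
        [initial_date],
        [final_date],
        [fixed_fee],
        -- Latest final_date among all earlier rows of the same location and asset.
        -- NULL for the first row of each group, which therefore never matches below.
        MAX([final_date]) OVER
        (
            PARTITION BY [location_id], [asset_id]
            ORDER BY [initial_date], [final_date], [uid]
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS [prev_max_final_date]
    FROM #tbl
)
SELECT
    [uid],
    [location_id],
    [asset_id],
    [initial_date],
    [final_date],
    [fixed_fee],
    [prev_max_final_date]
FROM [cte_prev]
WHERE [initial_date] <= [prev_max_final_date] -- starts before an earlier range has ended
ORDER BY
    [location_id],
    [asset_id],
    [initial_date];
```

Using the running maximum instead of just the previous row's final_date means a long early range that swallows several later rows is still detected for each of them.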