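I'm hoping someone can help me find the best way to approach this problem.
Our organization currently uses a sales cycle, based on a retailer's first ship date, to judge how that retailer is performing. These are the business rules: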
Nurture Stage - 1st year
Graduate Stage - 2nd year
Ongoing Stage - 3rd year and on
Inactive Stage - stop doing business
Restart Stage - do business with us after an Inactive Stage
Change Owner Stage - sell their business and new owner does business with us
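To make this more complicated, a retailer cannot be on more than one type of program at any given time. So if they are buying finished goods from us, they cannot also be on the program where they buy the materials they need themselves.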
StageNo     ProgramNo  CustomerNo  ProgramType       StageDescription  StartDate   EndDate
CAPS041835  CAP010611  RL023238    Packaged          Nurture           2019-04-04  2019-04-04
CAPS041836  CAP010611  RL023238    Packaged          Inactive          2019-04-05  2999-01-01
CAPS041837  CAP010612  RL023238    Pre-Made in Bulk  Nurture           2019-04-04  2999-01-01
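The rows above are an example of the data anomaly. 2999-01-01 just means the date is blank in our ERP.
On 2019-04-04, a user created the Packaged program and then decided the retailer should have been set up as Pre-Made in Bulk instead of Packaged.
The ERP ends the current stage on the last invoice date. If there is no invoice, it ends the current stage with today's date and starts the Inactive stage from today + 1.
So if I run an analysis, every shipment on 2019-04-04 is applied to both the Packaged program and the Pre-Made in Bulk program.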
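One way to list anomalies like this is a self-join that pairs stages of the same customer from different programs when their date ranges overlap. This is only a sketch: it assumes the [CustomerProgramAndStage] view used in the query further down exposes the sample columns as [CustomerProgramNo], [CustomerProgramStageNo], [StageStartDate] and [StageEndDate], and that both range ends are inclusive.

```sql
-- Sketch: pairs of stages for one customer, in two different programs,
-- whose date ranges overlap. Column names are assumed from the CTE query.
SELECT
     a.[CustomerNo]
    ,a.[CustomerProgramNo]      AS [ProgramNoA]
    ,a.[CustomerProgramStageNo] AS [StageNoA]
    ,a.[StageStartDate]         AS [StartDateA]
    ,a.[StageEndDate]           AS [EndDateA]
    ,b.[CustomerProgramNo]      AS [ProgramNoB]
    ,b.[CustomerProgramStageNo] AS [StageNoB]
    ,b.[StageStartDate]         AS [StartDateB]
    ,b.[StageEndDate]           AS [EndDateB]
FROM
    [CustomerProgramAndStage] AS a
    INNER JOIN [CustomerProgramAndStage] AS b
        ON  a.[CustomerNo] = b.[CustomerNo]
        -- "<" rather than "<>" so each pair is reported only once
        AND a.[CustomerProgramNo] < b.[CustomerProgramNo]
        -- two inclusive ranges overlap when each starts on or before the other ends
        AND a.[StageStartDate] <= b.[StageEndDate]
        AND b.[StageStartDate] <= a.[StageEndDate]
ORDER BY
     a.[CustomerNo] ASC
    ,a.[StageStartDate] ASC;
```

Because the open-ended 2999-01-01 date is treated as a real end date here, an Inactive stage that is still open will pair with every later stage of another program. Whether Inactive stages should be excluded from the check is a business decision.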
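Ideally I would like to get rid of the Packaged program entirely, but if that is not possible, this is what I would like to clean it up to: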
StageNo     ProgramNo  CustomerNo  ProgramType       StageDescription  StartDate   EndDate
CAPS041835  CAP010611  RL023238    Packaged          Nurture           2019-04-04  2019-04-04
CAPS041836  CAP010611  RL023238    Packaged          Inactive          2019-04-04  2019-04-04
CAPS041837  CAP010612  RL023238    Pre-Made in Bulk  Nurture           2019-04-04  2999-01-01
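That way I can go in and fix it. Even if I leave it like that, it won't be the end of the world, because I can treat the ship date as a DateTime and add 1 second, which means a sale will only ever belong to one program.
I started by writing a query that finds the gaps between date ranges, so I can look for any gap with a date difference of less than 0.
Here is what I have so far...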
WITH CustomerProgram AS
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY [CustomerNo] ASC, [ProgramGroupId] ASC, [StageStartDate] ASC, [StageEndDate] ASC, [StagePrecedence] ASC, [CustomerProgramStageNo] ASC) AS [RowId]
        ,*
        ,COUNT([CustomerProgramStageNo]) OVER (PARTITION BY [ProgramGroupId]) AS [StageCount]
    FROM
    (
        SELECT
            --RANK() OVER (ORDER BY [CustomerNo] ASC, [ProgramDescription] ASC) AS [ProgramGroupId]
            RANK() OVER (ORDER BY [CustomerNo] ASC, [CustomerProgramNo] ASC) AS [ProgramGroupId]
            ,[CustomerProgramNo]
            ,[CustomerProgramStageNo]
            ,[CustomerNo]
            ,[ProgramCode]
            ,[ProgramStageCode]
            ,[ProgramStageDescription]
            ,CASE [ProgramStageDescription]
                WHEN 'Nurture' THEN 1
                WHEN 'Graduate' THEN 2
                WHEN 'Change Ownership' THEN 3
                WHEN 'Restart' THEN 3
                WHEN 'Ongoing' THEN 4
                WHEN 'Inactive' THEN 5
                ELSE NULL
             END AS [StagePrecedence]
            ,CAST([StageStartDate] AS DATETIME) AS [StageStartDate]
            ,CAST([StageEndDate] AS DATETIME) AS [StageEndDate]
        FROM
            [CustomerProgramAndStage]
    ) CustomerProgram
)
,StagesAndGaps AS
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY [CustomerNo] ASC, [ProgramGroupId] ASC, [StageStartDate] ASC, [StageEndDate] ASC) AS [RowId]
        ,[ProgramGroupId]
        ,[StageCount]
        ,[CustomerNo]
        ,[DateRangeType]
        ,[StageStartDate]
        ,[StageEndDate]
        ,DATEDIFF(DAY,[StageStartDate],[StageEndDate]) AS [StageDateDayDiff]
        ,DATEDIFF(YEAR,[StageStartDate],[StageEndDate]) AS [StageDateYearDiff]
        ,[StartDateRowId]
        ,[EndDateRowId]
        ,[PreviousProgramCode]
        ,[NextProgramCode]
        ,[PreviousStagePrecedence]
        ,[NextStagePrecedence]
        ,[PreviousStageNo]
        ,[NextStageNo]
    FROM
    (
        SELECT
            [ProgramGroupId] AS [ProgramGroupId]
            ,[StageCount] AS [StageCount]
            ,[CustomerNo] AS [CustomerNo]
            ,[DateRangeType] AS [DateRangeType]
            ,ISNULL([StageStartDate],'1800-01-01') AS [StageStartDate]
            ,ISNULL([StageEndDate],'3999-01-01') AS [StageEndDate]
            ,ISNULL([StartDateRowId],0) AS [StartDateRowId]
            ,ISNULL([EndDateRowId],9999999) AS [EndDateRowId]
            ,ISNULL([PreviousProgramCode],'Start') AS [PreviousProgramCode]
            ,ISNULL([NextProgramCode],'End') AS [NextProgramCode]
            ,ISNULL([PreviousStagePrecedence],0) AS [PreviousStagePrecedence]
            ,ISNULL([NextStagePrecedence],999) AS [NextStagePrecedence]
            ,ISNULL([PreviousStageNo],'Start') AS [PreviousStageNo]
            ,ISNULL([NextStageNo],'End') AS [NextStageNo]
        FROM
        (
            SELECT -- Gaps include time period before the start of a Program
                NextStage.[ProgramGroupId] AS [ProgramGroupId]
                ,NextStage.[StageCount] AS [StageCount]
                ,NextStage.[CustomerNo] AS [CustomerNo]
                ,'Gap' AS [DateRangeType]
                ,PreviousStage.[StageEndDate] AS [StageStartDate]
                ,NextStage.[StageStartDate] AS [StageEndDate]
                ,PreviousStage.[RowId] AS [StartDateRowId]
                ,NextStage.[RowId] AS [EndDateRowId]
                ,PreviousStage.[ProgramCode] AS [PreviousProgramCode]
                ,NextStage.[ProgramCode] AS [NextProgramCode]
                ,PreviousStage.[StagePrecedence] AS [PreviousStagePrecedence]
                ,NextStage.[StagePrecedence] AS [NextStagePrecedence]
                ,PreviousStage.[CustomerProgramStageNo] AS [PreviousStageNo]
                ,NextStage.[CustomerProgramStageNo] AS [NextStageNo]
            FROM
            (
                SELECT
                    [RowId]
                    ,[ProgramGroupId]
                    ,[StageCount]
                    ,[CustomerProgramStageNo]
                    ,[CustomerNo]
                    ,[ProgramCode]
                    ,[StagePrecedence]
                    ,[StageStartDate]
                FROM
                    CustomerProgram
            ) NextStage
            LEFT JOIN
            (
                SELECT
                    [RowId]
                    ,[ProgramGroupId]
                    ,[StageCount]
                    ,[CustomerProgramStageNo]
                    ,[CustomerNo]
                    ,[ProgramCode]
                    ,[StagePrecedence]
                    ,[StageEndDate]
                FROM
                    CustomerProgram
            ) PreviousStage
                ON NextStage.[ProgramGroupId] = PreviousStage.[ProgramGroupId]
                AND NextStage.[RowId] - 1 = PreviousStage.[RowId]
            UNION
            SELECT -- Gaps include time period after the end of a Program (year 2999 if Stage is active)
                PreviousStage.[ProgramGroupId] AS [ProgramGroupId]
                ,PreviousStage.[StageCount] AS [StageCount]
                ,PreviousStage.[CustomerNo] AS [CustomerNo]
                ,'Gap' AS [DateRangeType]
                ,PreviousStage.[StageEndDate] AS [StageStartDate]
                ,NextStage.[StageStartDate] AS [StageEndDate]
                ,PreviousStage.[RowId] AS [StartDateRowId]
                ,NextStage.[RowId] AS [EndDateRowId]
                ,PreviousStage.[ProgramCode] AS [PreviousProgramCode]
                ,NextStage.[ProgramCode] AS [NextProgramCode]
                ,PreviousStage.[StagePrecedence] AS [PreviousStagePrecedence]
                ,NextStage.[StagePrecedence] AS [NextStagePrecedence]
                ,PreviousStage.[CustomerProgramStageNo] AS [PreviousStageNo]
                ,NextStage.[CustomerProgramStageNo] AS [NextStageNo]
            FROM
            (
                SELECT
                    [RowId]
                    ,[ProgramGroupId]
                    ,[StageCount]
                    ,[CustomerProgramStageNo]
                    ,[CustomerNo]
                    ,[ProgramCode]
                    ,[StagePrecedence]
                    ,[StageEndDate]
                FROM
                    CustomerProgram
            ) PreviousStage
            LEFT JOIN
            (
                SELECT
                    [RowId]
                    ,[ProgramGroupId]
                    ,[StageCount]
                    ,[CustomerProgramStageNo]
                    ,[CustomerNo]
                    ,[ProgramCode]
                    ,[StagePrecedence]
                    ,[StageStartDate]
                FROM
                    CustomerProgram
            ) NextStage
                ON PreviousStage.[ProgramGroupId] = NextStage.[ProgramGroupId]
                AND PreviousStage.[RowId] + 1 = NextStage.[RowId]
            UNION
            SELECT -- Stage data
                [ProgramGroupId] AS [ProgramGroupId]
                ,[StageCount] AS [StageCount]
                ,[CustomerNo] AS [CustomerNo]
                ,'Stage' AS [DateRangeType]
                ,[StageStartDate] AS [StageStartDate]
                ,[StageEndDate] AS [StageEndDate]
                ,[RowId] AS [StartDateRowId]
                ,[RowId] AS [EndDateRowId]
                ,[ProgramCode] AS [PreviousProgramCode]
                ,[ProgramCode] AS [NextProgramCode]
                ,[StagePrecedence] AS [PreviousStagePrecedence]
                ,[StagePrecedence] AS [NextStagePrecedence]
                ,[CustomerProgramStageNo] AS [PreviousStageNo]
                ,[CustomerProgramStageNo] AS [NextStageNo]
            FROM
                CustomerProgram
        ) StagesAndGaps
    ) StagesAndGaps
)
SELECT
    *
FROM
    StagesAndGaps
WHERE
    [DateRangeType] = 'Gap'
    AND [StageStartDate] NOT IN ('1800-01-01','2999-01-01')
ORDER BY
    [RowId] ASC
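I think I'm heading in the right direction, but I'm not sure whether there is a simpler way. Sorry for the long post, and any help would be greatly appreciated!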
Answer 0 (score: 0)
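You can use PARTITION BY and ORDER BY to split the data set into ordered chunks and then identify the records that need to be updated or deleted, which is what you have tried. However, you can be more precise.
For example, you used only ORDER BY, which gives you something like this: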
row_num  customer_no  stage  stage_startdate
1        1            A      2019-01-01
2        2            B      2019-12-30
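Here you cannot compare row_num 1 with row_num 2, because they belong to two different customers.
So first use PARTITION BY to split the data into chunks, and then use ORDER BY to order the rows inside each chunk.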
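As a rough illustration of that point, the row number can be restarted for every customer. The table and column names below are borrowed from the query in the question and are assumptions about the real schema.

```sql
-- Sketch: number the stages separately for each customer, so that
-- row N and row N-1 always belong to the same customer.
SELECT
     ROW_NUMBER() OVER (
         PARTITION BY [CustomerNo]            -- restart numbering per customer
         ORDER BY [StageStartDate] ASC
                 ,[StageEndDate] ASC
                 ,[CustomerProgramStageNo] ASC -- tie-breaker for a stable order
     ) AS [RowNum]
    ,[CustomerNo]
    ,[CustomerProgramNo]
    ,[CustomerProgramStageNo]
    ,[ProgramStageDescription]
    ,[StageStartDate]
    ,[StageEndDate]
FROM
    [CustomerProgramAndStage];
```

Partitioning by [CustomerNo] alone, rather than by customer and program, is deliberate in this sketch: the overlap described in the question is between two programs of the same customer, so both programs need to end up in the same chunk.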
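Also, instead of updating rows, you can flag the unwanted records and then delete them.
To do that, add a "to_be_deleted" column that marks the records to remove. If you are on SQL Server 2012 or later, you can populate this column easily by applying LEAD() or LAG() on top of the partitioned output. LEAD() and LAG() let you compare a row with the next or the previous row, so you can check for duplicates, flag them, and finally delete them.
For LEAD() and LAG(), see: https://blog.sqlauthority.com/2011/11/15/sql-server-introduction-to-lead-and-lag-analytic-functions-introduced-in-sql-server-2012/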
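A minimal sketch of the LAG() step, again assuming the table and column names from the question. It only computes an [OverlapsPrevious] flag; the rule that turns this into "to_be_deleted" (for example, which of two overlapping programs is the mistake) is not something the post defines, so it is left out.

```sql
-- Sketch: compare each stage with the previous stage of the same customer.
WITH OrderedStages AS
(
    SELECT
         [CustomerNo]
        ,[CustomerProgramNo]
        ,[CustomerProgramStageNo]
        ,[StageStartDate]
        ,[StageEndDate]
        -- values from the previous row in the same customer partition (NULL on the first row)
        ,LAG([CustomerProgramStageNo]) OVER (
             PARTITION BY [CustomerNo]
             ORDER BY [StageStartDate] ASC, [CustomerProgramStageNo] ASC
         ) AS [PreviousStageNo]
        ,LAG([StageEndDate]) OVER (
             PARTITION BY [CustomerNo]
             ORDER BY [StageStartDate] ASC, [CustomerProgramStageNo] ASC
         ) AS [PreviousStageEndDate]
    FROM
        [CustomerProgramAndStage]
)
SELECT
     *
    -- 1 when this stage starts on or before the day the previous stage ends
    ,CASE
         WHEN [PreviousStageEndDate] >= [StageStartDate] THEN 1
         ELSE 0
     END AS [OverlapsPrevious]
FROM
    OrderedStages
ORDER BY
     [CustomerNo] ASC
    ,[StageStartDate] ASC
    ,[CustomerProgramStageNo] ASC;
```

LAG() only looks at the immediately preceding row, so a long open-ended stage that overlaps several later stages is only compared with the first of them. If that matters for your data, a self-join on overlapping ranges may be the safer check.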
Hope this helps :). Good initiative.