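An interesting, possibly complicated SQL query to determine the number of consecutive months

Asked: 2011-11-01 20:54:10

Tags: sql sql-server-2005 sql-server-2008

First off, thanks to everyone who helps me with this question! I have googled and tried to learn as much as I can about recursive CTEs and the other advanced features of MS SQL 2005/2008, but I am stumped.

I have a table that looks like this: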

AgentID, GoalReached, MonthEndDate
360123, 1, 1/22/2011
360123, Null, 2/22/2011
360123, 1, 3/22/2011
360123, 1, 4/22/2011
360123, 1, 5/22/2011
360123, 2, 6/22/2011
360124, 2, 1/22/2011
360124, 2, 2/22/2011
360124, 1, 3/22/2011
360124, 2, 4/22/2011
360124, 2, 5/22/2011
360124, 3, 6/22/2011

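How can I produce a table, by AgentID, that shows only the agents who reached a level 4 times within a period of 6 consecutive months, and then shows the 6-month period in which the goal was reached?

Also, an agent who reaches level 2 or higher has also reached the levels below it, such as level 1. So if an agent has 3 level 2s and 3 level 1s within the 6 months, the agent has only reached level 1.

The final result would be: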

AgentID, LevelReached, StartDate, EndDate
360123,1,3/22/2011,6/22/2011
360124,2,2/22/2011,6/22/2011

Thanks for any insight you can provide.

1 Answer:

Answer 0: (score: 2)

I came up with an approach that seems to work:

declare @t table ( AgentID int, GoalReached int, MonthEndDate date)
insert @t (AgentID , GoalReached , MonthEndDate )
values 
(360123, Null, '2/22/2011'),
(360123, 1, '1/22/2011'),
(360123, 1, '3/22/2011'),
(360123, 1, '4/22/2011'),
(360123, 1, '5/22/2011'),
(360123, 2, '6/22/2011'),
(360124, 1, '3/22/2011'),
(360124, 2, '1/22/2011'),
(360124, 2, '2/22/2011'),
(360124, 2, '4/22/2011'),
(360124, 2, '5/22/2011'),
(360124, 3, '6/22/2011')

select  t1.AgentID, t1.GoalReached, t1.MonthEndDate StartDate, max(t2.MonthEndDate) Enddate
from    @t t1
        inner join @t t2 
            on t1.AgentID = t2.AgentID
            and t1.GoalReached = t2.GoalReached
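            -- find any other rows for the same agent within 6 months
            -- (note: 0..6 spans seven calendar months; use "between 0 and 5" for a strict six-month window)
            and datediff(m, t1.MonthEndDate, t2.MonthEndDate) between 0 and 6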
group by t1.AgentID, t1.GoalReached, t1.MonthEndDate
having count(*) >= 4

This gives me these rows:

360123  1   2011-01-22  2011-05-22
360124  2   2011-01-22  2011-05-22

I think the sample output you provided may be off, though. (For agent 360123 the 6/22 end date is level 2, not level 1. For 360124 that date is level 3, not level 2.)
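The query above joins on equal GoalReached, so it does not apply the question's rule that reaching a higher level also counts toward the levels below it. The following is only a sketch of one possible way to apply that rule: each row is first expanded into one row per level it satisfies, and the same self-join is then run per level. The CTE names levels and expanded and the columns Lvl and MonthsAtLevel are made up for illustration, and the window is assumed to be a strict six months (datediff between 0 and 5).

```sql
-- Sample data from the question, written as unambiguous YYYYMMDD literals.
declare @t table (AgentID int, GoalReached int, MonthEndDate date);
insert @t (AgentID, GoalReached, MonthEndDate)
values
(360123, 1,    '20110122'), (360123, null, '20110222'), (360123, 1, '20110322'),
(360123, 1,    '20110422'), (360123, 1,    '20110522'), (360123, 2, '20110622'),
(360124, 2,    '20110122'), (360124, 2,    '20110222'), (360124, 1, '20110322'),
(360124, 2,    '20110422'), (360124, 2,    '20110522'), (360124, 3, '20110622');

with
-- every level that occurs in the data (hypothetical helper CTE)
levels as (
    select distinct GoalReached as Lvl
    from   @t
    where  GoalReached is not null
)
-- one row per (agent, month, level satisfied): a level 2 month also yields a level 1 row
,expanded as (
    select t.AgentID, l.Lvl, t.MonthEndDate
    from   @t t
           inner join levels l
               on t.GoalReached >= l.Lvl   -- NULL goals match nothing and drop out
)
select  e1.AgentID,
        e1.Lvl               as LevelReached,
        e1.MonthEndDate      as StartDate,
        max(e2.MonthEndDate) as EndDate,
        count(*)             as MonthsAtLevel
from    expanded e1
        inner join expanded e2
            on  e1.AgentID = e2.AgentID
            and e1.Lvl = e2.Lvl
            -- assumed strict six-month window starting at e1's month
            and datediff(m, e1.MonthEndDate, e2.MonthEndDate) between 0 and 5
group by e1.AgentID, e1.Lvl, e1.MonthEndDate
having  count(*) >= 4
order by e1.AgentID, e1.Lvl, e1.MonthEndDate;
```

Like the original query, this returns one row per qualifying start month, so overlapping windows appear as separate rows, and an agent who qualifies at level 2 also appears at level 1.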

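Let me know if this does what you need. You have not specified how to handle, for example, 9 consecutive months in which the goal is reached every month. This query counts overlapping periods as separate groups, so let me know whether, for example, a row may belong to only one group.

Regards, GJ

Edit

Hi, OK, here it is, see below.

To remove the overlapping sequences I used a recursive CTE. I think it is the only way to "group" rows while skipping any row that is already in a previous group. I would be interested to see other approaches.

Personally, I would consider a more "procedural" approach: using a cursor, or actually updating the rows with something that assigns them to a group. This query is hard to get your head around and could become a problem for future code maintenance.

So, that said, enjoy! Thanks for this one, a nice little brain-teaser ;-)

Rgds GJ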

declare @t table (AgentID int, GoalReached int, MonthEndDate date)
insert @t (AgentID , GoalReached , MonthEndDate )
values 
(360123, Null, '2/22/2011'),
(360123, 1, '1/22/2011'),
(360123, 1, '3/22/2011'),
(360123, 1, '4/22/2011'),
(360123, 1, '5/22/2011'),
(360123, 2, '6/22/2011'),
(360124, 1, '3/22/2011'),
(360124, 2, '1/22/2011'),
(360124, 2, '2/22/2011'),
(360124, 2, '4/22/2011'),
(360124, 2, '5/22/2011'),
(360124, 3, '6/22/2011'),

(100, 1, '1/1/2010'),
(100, 1, '2/1/2010'),
(100, 2, '3/1/2010'),
(100, 1, '4/1/2010'),
(100, 1, '5/1/2010'),
(100, 1, '6/1/2010'),
(100, 1, '7/1/2010'),
(100, 1, '8/1/2010'),
(100, 1, '9/1/2010')
;


with 
--1: find groups of rows that are within 6 months of each other and number the rows
step1 as (
select  t1.AgentID,
        t1.MonthEndDate StartDate,
        t2.GoalReached,
        t2.MonthEndDate EndDate,
        row_number() over (partition by t1.AgentID, t1.MonthEndDate order by t2.MonthEndDate) RowRank
from    @t t1
        inner join @t t2 
            on t1.AgentID = t2.AgentID
            --and t1.GoalReached = t2.GoalReached
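            -- find any other rows for the same agent within 6 months
            -- (note: 0..6 spans seven calendar months; use "between 0 and 5" for a strict six-month window)
            and datediff(m, t1.MonthEndDate, t2.MonthEndDate) between 0 and 6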
)
--select * from step1
-- cut sequences off to no longer than 4 rows and get rid of shorter sequences
,step2 as (
    select  * 
    from    step1 t1
    where   exists (
        select *
        from    step1 t2
        where   t1.AgentID = t2.AgentID
                and t1.StartDate = t2.StartDate
                --t1.id1 = t2.id1       -- same group(same start row)
                and t2.RowRank  = 4 -- group has to have at least 4 rows 
                and t1.RowRank <= 4 -- get rid of rows that are beyond 4
    )
)
--select * from step2 
-- collapse groups to a single row (makes next step easier)
,grps as (
    select  AgentID,            
            MAX(GoalReached) MaxGoal,
            MIN(StartDate) StartDate,
            MAX(EndDate) EndDate,
            dense_rank() over (partition by AgentId order by StartDate) GrpRank
    from    step2
    group by agentid,
            StartDate
)
--select * from grps
-- use common table expression to remove overlap (only way I could figure out how)
,cte as (
    -- anchor to first sequence of 4 rows for each agent
    select  AgentID,
            StartDate,
            EndDate,
            MaxGoal
    from    grps
    where   GrpRank = 1

    union all 

    -- repeat to find following sequences
    select  AgentID,
            StartDate,
            EndDate,
            MaxGoal     
    from    (
                select  g.*, row_number() over (partition by g.AgentId order by g.StartDate) grp_rank
                from    grps g              
                inner join cte c 
                        on  g.AgentID = c.AgentID   -- same agent                       
                        and g.StartDate > c.EndDate -- group must start after previous group has ended (here we remove the overlap)

            )s
    where   s.grp_rank = 1  -- only add 1 group per agent for each iteration of the CTE     
)
select  *
from    cte
order by AgentID, StartDate
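
For comparison, if "consecutive" is read strictly, as an unbroken run of months at or above a given level, the well-known gaps-and-islands technique finds the runs without recursion. This is a sketch under a different interpretation from the query above, not a replacement for it. It assumes one row per agent per month, and the names @level, numbered, Island and ConsecutiveMonths are made up for illustration.

```sql
-- Sample data from the question, written as unambiguous YYYYMMDD literals.
declare @t table (AgentID int, GoalReached int, MonthEndDate date);
insert @t (AgentID, GoalReached, MonthEndDate)
values
(360123, 1,    '20110122'), (360123, null, '20110222'), (360123, 1, '20110322'),
(360123, 1,    '20110422'), (360123, 1,    '20110522'), (360123, 2, '20110622'),
(360124, 2,    '20110122'), (360124, 2,    '20110222'), (360124, 1, '20110322'),
(360124, 2,    '20110422'), (360124, 2,    '20110522'), (360124, 3, '20110622');

-- hypothetical parameter: the level to test for
declare @level int;
set @level = 1;

with numbered as (
    select  AgentID,
            MonthEndDate,
            -- month index minus row number is constant within an unbroken run of months,
            -- so it serves as an island key (assumes at most one row per agent per month)
            datediff(m, '19000101', MonthEndDate)
              - row_number() over (partition by AgentID order by MonthEndDate) as Island
    from    @t
    where   GoalReached >= @level   -- a higher level counts as the lower one; NULLs drop out
)
select  AgentID,
        min(MonthEndDate) as StartDate,
        max(MonthEndDate) as EndDate,
        count(*)          as ConsecutiveMonths
from    numbered
group by AgentID, Island
having  count(*) >= 4               -- keep runs of at least 4 months
order by AgentID, StartDate;
```

On the question's sample data with @level = 1, the run for agent 360123 is March through June 2011, because the NULL in February breaks the sequence. Runs never overlap here, so no separate step is needed to remove overlap.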