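SQL SSMS 2008 - Count column ID, and count repeated IDs only if CreatedDate is more than 3 months apart between IDs

Time: 2016-01-04 14:48:31

Tags: sql-server

*Edit (hopefully clearer)

For the table below, I want to count the IDs, and also count the repeated IDs whose CreatedDate values are 3 months or more apart.

The query I have so far...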

if object_id('tempdb..#temp') is not null 
        begin drop table #temp end
select
top 100
 a.id, a.CreatedDate

into #temp
from tbl a

where 1=1
--and year(CreatedDate) = '2015'

if object_id('tempdb..#temp2') is not null 
        begin drop table #temp2 end
select t.id, count(t.id) as Total_Cnt
into #temp2
from #temp t
group by id

select distinct #temp2.Total_Cnt, #temp2.id, #temp.CreatedDate, DENSE_RANK()     over (partition by #temp.id order by createddate) RK
from #temp2
inner join #temp on #temp2.id = #temp.id

where 1=1

order by Total_Cnt desc

Result:

Total_cnt    id    createddate            rk
  3           1       01-01-2015          1
  3           1       03-02-2015          2
  3           1       01-02-2015          3               
  2           2       05-01-2015          1
  2           2       05-02-2015          2
  1           3       06-01-2015          1
  1           4       07-01-2015          1

Count the IDs, and count a repeated ID only when its CreatedDate is more than 3 months after the previous one.

Like this...

Total_cnt    id     Countwith3monthgap
  3           1            2
  2           2            1
  1           3            1
  1           4            1

3 Answers:

Answer 0: (score: 1)

You can use a CTE with ROW_NUMBER to get the ordering, then self-join the CTE on that ordering.

WITH cte AS 
(   SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedDate) Rn
    FROM
        Test
)
SELECT
    c1.ID,
    COUNT(CASE WHEN c2.CreatedDate IS NULL THEN 1
                WHEN c1.CreatedDate >= DATEADD(month,3,c2.CreatedDate) THEN 1
            END)
FROM
    cte c1
    LEFT JOIN cte c2 ON c1.ID = c2.ID
                        AND c1.RN = c2.RN + 1
GROUP BY
    c1.ID

You also need a conditional count that counts a row only when the previous CreatedDate is NULL or the current CreatedDate is >= the previous CreatedDate + 3 months.
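As a minimal sketch of why the conditional count works: COUNT(expression) skips NULLs, and a CASE with no ELSE branch yields NULL for rows that match no WHEN. The inline values and the names t, v, AllRows and RowsMatching below are hypothetical and only serve the illustration.

```sql
-- Hypothetical values: only rows matching the WHEN branch are counted,
-- because the CASE returns NULL for the rest and COUNT ignores NULLs.
SELECT
    COUNT(*)                           AS AllRows,       -- 3
    COUNT(CASE WHEN v >= 3 THEN 1 END) AS RowsMatching   -- 2
FROM (VALUES (1), (3), (5)) AS t(v);
```

The table value constructor in the FROM clause is available from SQL Server 2008 onward, so this should run on the asker's version.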

If you happen to be on SQL Server 2012+, you can also use LAG here to get the same result.

SELECT
    ID,
    COUNT(*)
FROM
    (SELECT 
        ID, 
        CreatedDate CurrentDate,
        LAG(CreatedDate) OVER (PARTITION BY ID ORDER BY CreatedDate) PreviousDate
    FROM 
        Test
    ) T
WHERE
    PreviousDate IS NULL 
    OR CurrentDate >= DATEADD(month, 3, PreviousDate)
GROUP BY 
    ID

Answer 1: (score: 0)

You can use LAG to get the previous date; it is NULL for the first row of each ID.

SELECT 
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM @t 

You can use that as a subquery and get the month difference with DATEDIFF

SELECT sub.id,DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate) 
FROM (SELECT 
      id,
      lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
      CreatedDate
      FROM @t) sub
WHERE DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL

Then you can take your count

SELECT sub.id,COUNT(sub.id) as cnt
FROM (SELECT 
      id,
      lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
      CreatedDate
      FROM @t) sub
WHERE DATEDIFF(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL
GROUP BY sub.id

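Note that DATEDIFF counts month boundaries crossed, so with DATEDIFF the last day of January is three months before the first day of April. That seems to be the logic you are after.

Alternatively, you may want to define the three-month gap criterion as either of the following: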

WHERE sub.PreviousCreateDate  <= DATEADD(month, -3, sub.CreatedDate)
OR sub.PreviousCreateDate IS NULL

WHERE sub.CreatedDate >= DATEADD(month, +3, sub.PreviousCreateDate )
OR sub.PreviousCreateDate IS NULL
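To see how the two criteria can disagree, here is a small sketch with hypothetical dates (the variable names @prev and @curr are not from the question). DATEDIFF(month, ...) counts month boundaries crossed, while DATEADD(month, 3, ...) moves to the same day three months later, clamped to the end of the month.

```sql
-- Hypothetical dates: 31 January and 1 April of the same year.
DECLARE @prev datetime = '20150131', @curr datetime = '20150401';

SELECT
    DATEDIFF(month, @prev, @curr) AS MonthBoundariesCrossed,  -- 3
    DATEADD(month, 3, @prev)      AS ThreeMonthsAfterPrev,    -- 2015-04-30
    CASE WHEN @curr >= DATEADD(month, 3, @prev)
         THEN 1 ELSE 0 END        AS FullThreeMonths;         -- 0
```

So the DATEDIFF test accepts this pair as a three-month gap, while the DATEADD test does not, because 1 April falls before 30 April.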

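Answer 2: (score: 0)

I'm guessing that the definition of a three-month gap you are hoping for doesn't match datediff(). Most of the logic here is about looking back at the previous date and deciding whether the gap is large enough to qualify.

When datediff() reports a difference of exactly three months, we still need to make sure that the day of the month is not earlier than that of the first date (per the example and ID 5). If the difference is more than three months, we are automatically fine.

But I'm also assuming you would want the span from November 30 to February 28 (or the 29th in a leap year) to count as a full three months, since the end date falls on the last day of the month. That case is easy to catch by adjusting the end date by one day, since this pushes the date into the next month and increases the month difference by one. If that's not what you want, just remove the dateadd(day, 1, ...) part and use only the raw CreatedDate value.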
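A quick way to check the month-end adjustment is to evaluate the pieces for one such pair. This is only a sketch with hypothetical dates and variable names (@prev, @curr), not part of the answer's queries.

```sql
-- Hypothetical pair: 30 November 2014 followed by 28 February 2015.
DECLARE @prev datetime = '20141130', @curr datetime = '20150228';

SELECT
    DATEDIFF(month, @prev, @curr)                  AS MonthDiff,          -- 3
    DATEPART(day, @prev)                           AS PrevDay,            -- 30
    DATEPART(day, @curr)                           AS CurrDay,            -- 28
    DATEDIFF(month, @prev, DATEADD(day, 1, @curr)) AS MonthDiffAdjusted;  -- 4
```

The day-of-month test fails here (30 is not <= 28), but adding one day moves the end date to 1 March, which raises the month difference to 4 and lets the pair qualify through the second branch of the condition.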

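Your sample data is limited, so I'm also assuming that gaps are measured between consecutive dates. If you are instead looking for blocks of rows that run no longer than three months across the whole set, that is a different problem and you should provide more information.

Since you've indicated that you may be on SQL Server 2008, you would have to do without lag(), which is not available before SQL Server 2012. While the first query could be adapted, it's probably easier to go with the second approach in the end.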

with diffs as (
    select
        ID,
        row_number() over (partition by ID order by CreatedDate) as RN,
        case when
            datediff(
                month,
                lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
                CreatedDate
            ) = 3
            and
            datepart(
                day,
                lag(CreatedDate, 1) over (partition by ID order by CreatedDate)
            ) <= datepart(day, CreatedDate)
            or
            datediff(
                month,
                lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
                /* adding one day to handle gaps like Nov30 - Feb28/29 and Jan31 - Apr30 */
                dateadd(day, 1, CreatedDate)
            ) >= 4
            then 1
            else 0
        end as GapFlag
    from <T> /* <--- your table name here */
), gaps as (
    select
        ID, RN,
        sum(1 + GapFlag) over (partition by ID order by RN) as Counter
    from diffs
)
select ID, count(distinct Counter - RN) as "Count"
from gaps
group by ID

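The rest of the logic is a typical gaps-and-islands scenario: it looks for holes in the running sum(1 + GapFlag) sequence, which advances by one per row, much like row_number(), except where a gap is flagged.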
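The counting step can be traced on a tiny hypothetical input: three rows for a single ID, with only the third flagged as a qualifying gap. The inline values and the Island alias are assumptions for illustration, and the windowed SUM with ORDER BY needs SQL Server 2012 or later.

```sql
-- Hypothetical rows for one ID: RN is the row number, GapFlag marks a qualifying gap.
WITH diffs AS (
    SELECT RN, GapFlag
    FROM (VALUES (1, 0), (2, 0), (3, 1)) AS v(RN, GapFlag)
), gaps AS (
    SELECT RN, GapFlag,
           SUM(1 + GapFlag) OVER (ORDER BY RN) AS Counter  -- running sum: 1, 2, 4
    FROM diffs
)
SELECT RN, GapFlag, Counter,
       Counter - RN AS Island   -- 0, 0, 1: changes only where a gap was flagged
FROM gaps;
```

Counter - RN is simply the running total of GapFlag, so count(distinct Counter - RN) here would be 2: one for the first row plus one for each flagged gap.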

http://sqlfiddle.com/#!6/61b12/3

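JamieD77's approach also works. I originally thought your problem involved more than just looking at rows in sequence. Here is how I would adapt it to the gap definition I have been working with: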

with data as (
    select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
    from T
)
select d1.ID, count(*) as "Count"
from data d1 left outer join data d0
    on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
where
        datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
    and datepart(day, d0.CreatedDate) <= datepart(day, d1.CreatedDate)
    or  datediff(month, d0.CreatedDate, dateadd(day, 1, d1.CreatedDate)) >= 4
    or  d0.ID is null /* first row for the ID has no predecessor */
group by d1.ID

Edit: You have changed the question since yesterday.

Change this line in the first query to include the total count:

...
select count(*) as TotalCnt, ID, count(distinct Counter - RN) as GapCount
...

The second one would look like:

with data as (
    select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
    from T
)
select
    count(*) as TotalCnt, d1.ID,
    count(case when
            datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
        and datepart(day, d0.CreatedDate) <= datepart(day, d1.CreatedDate)
        or  datediff(month, d0.CreatedDate, dateadd(day, 1, d1.CreatedDate)) >= 4
        or  d0.ID is null then 1 end
    ) as GapCount
from data d1 left outer join data d0
    on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
group by d1.ID