Cumulative distinct count filtered by last value - T-SQL

Time: 2016-04-28 11:53:30

Tags: sql sql-server count distinct-values

I am looking for exactly the same result as in this question:

Cumulative distinct count filtered by last value - DAX

but in SQL Server. For convenience, I am copying the whole problem description.

I have a dataset:

month   name    flag
1       abc     TRUE
2       xyz     TRUE
3       abc     TRUE
4       xyz     TRUE
5       abc     FALSE
6       abc     TRUE

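I want to calculate the monthly cumulative distinct count of 'name', filtered by the last 'flag' value (TRUE). That is, I want to get this result: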

month   count
1       1
2       2
3       2
4       2
5       1
6       2

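In month 5, 'abc' should be excluded because its flag switched to 'FALSE' in that month. In month 6 it is counted again because its flag is back to 'TRUE' (see the update further down).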
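To make the example reproducible, here is a minimal setup sketch. The table name Source_data and the string comparison flag = 'TRUE' come from the query later in the question; the column types are assumptions, since the question does not state them.

```sql
-- Hypothetical setup: column types are assumed, not given in the question.
create table Source_data (
    month int         not null,
    name  varchar(10) not null,
    flag  varchar(5)  not null  -- stored as the strings 'TRUE' / 'FALSE'
);

-- Sample rows from the question (last row as in the updated data).
insert into Source_data (month, name, flag) values
    (1, 'abc', 'TRUE'),
    (2, 'xyz', 'TRUE'),
    (3, 'abc', 'TRUE'),
    (4, 'xyz', 'TRUE'),
    (5, 'abc', 'FALSE'),
    (6, 'abc', 'TRUE');
```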

I was thinking about using an "over" clause with "partition by", but I have no experience with it, so this is hard for me.

Update

I have updated the last row of the sample source data. Was: 6 abc FALSE. Is: 6 abc TRUE.

And the last row of the output data. Was: 6 1. Is: 6 2.

It may not have been obvious from the description that it should work this way, and the suggested answer does not handle this case.

Update 2

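I managed to write a query that gives the desired result, but it is ugly, and I think it could be shortened by using an over clause. Can you help me?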

select t5.month_current, count(*) as count from
    (select t3.month month_current, t4.month months_until_current, t3.name, t4.flag from
        (select name ,month from
            (select distinct name
                from Source_data) t1
            ,(select distinct month
                from Source_data) t2) t3
        left join
        Source_data t4
        on t3.name = t4.name and t3.month >= t4.month) t5
    inner join
    (select t3.month month_current, max(t4.month) real_max_month_until_current, t3.name from
        (select name ,month from
            (select distinct name
                from Source_data) t1
            ,(select distinct month
                from Source_data) t2) t3
        left join
        Source_data t4
        on t3.name = t4.name and t3.month >= t4.month
            group by
                t3.month, t3.name) t6
    on t5.month_current = t6.month_current
        and t5.months_until_current = t6.real_max_month_until_current
        and t5.name = t6.name
            where t5.flag = 'TRUE'
                group by t5.month_current

1 answer:

Answer 0 (score: 1)

You can get the cumulative distinct count as:

select t.*,
       sum(case when seqnum = 1 then 1 else 0 end) over (order by month) as cnt
from (select t.*,
             row_number() over (partition by name order by month) as seqnum
      from t
     ) t;

I don't understand the logic for incorporating the flag.

You can replicate the results in the question by incorporating the flag:

      select t.*,
             sum(case when seqnum = 1 and flag = 'true' then 1
                      when seqnum = 1 and flag = 'false' then -1
                      else 0
                 end) over (order by month) as cnt
      from (select t.*,
                   row_number() over (partition by name, flag order by month) as seqnum
            from t
           ) t;
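
Note that on the updated sample data (month 6 set back to TRUE) the query above still returns 1 for month 6, because the month 6 row is the third row in the (abc, TRUE) partition and contributes 0. One possible variant, offered only as a sketch, compares each row's flag with the previous flag for the same name and sums the resulting +1/-1 changes. It assumes SQL Server 2012 or later (for lag), the table Source_data from the question, flags stored as the strings 'TRUE'/'FALSE', and at most one row per name per month.

```sql
-- Sketch only: a lag-based variant, not part of the original answer.
select s.month,
       sum(case
               -- name enters the count: flag is TRUE and was not TRUE before
               when s.flag = 'TRUE'
                    and (s.prev_flag is null or s.prev_flag = 'FALSE') then 1
               -- name leaves the count: flag switched from TRUE to FALSE
               when s.flag = 'FALSE' and s.prev_flag = 'TRUE' then -1
               else 0
           end) over (order by s.month) as cnt
from (select month, name, flag,
             -- previous flag for the same name; null on the name's first row
             lag(flag) over (partition by name order by month) as prev_flag
      from Source_data
     ) s
order by s.month;
```

On the updated sample data this gives 1, 2, 2, 2, 1, 2 for months 1 to 6. A name whose first row is FALSE contributes 0 rather than -1, since it was never counted.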