SQL: finding unique IDs

Time: 2019-10-20 20:27:50

Tags: sql window-functions

I want to find the number of unique IDs that are active within a specific date range.

CREATE TABLE tbl_tmp
(
    id integer,
    start_dt date,
    end_dt date,
    status varchar(8)
);

INSERT INTO tbl_tmp VALUES (30, '2015-11-22','2015-11-22', 'Active');
INSERT INTO tbl_tmp VALUES (30, '2015-11-23', '2015-12-06', 'Active');
INSERT INTO tbl_tmp VALUES (40, '2015-11-26', '2015-11-26', 'Active');
INSERT INTO tbl_tmp VALUES (40, '2015-11-27', '2016-02-23', 'Active');
INSERT INTO tbl_tmp VALUES (30, '2015-12-06', '2015-12-07', 'Inactive');
INSERT INTO tbl_tmp VALUES (40, '2016-02-24', '2016-08-04', 'Active');

Expected output:

If the where clause is start_dt >= '2015-11-22' and end_dt <= '2015-12-05', the count of unique IDs should be 2, because both 30 and 40 are active throughout that time range.

If the where clause is start_dt >= '2015-11-22' and end_dt <= '2015-12-10', the count of unique IDs should be 1, because 30 is inactive as of '2015-12-07'.
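To make the two cases concrete, here is a minimal sketch using Python's built-in sqlite3 module with the sample rows above. The function name `naive_active_count` and the extra `status = 'Active'` filter are assumptions for illustration, not part of the question. It applies the where clause literally and shows why a plain filter is not enough.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (30, "2015-11-22", "2015-11-22", "Active"),
    (30, "2015-11-23", "2015-12-06", "Active"),
    (40, "2015-11-26", "2015-11-26", "Active"),
    (40, "2015-11-27", "2016-02-23", "Active"),
    (30, "2015-12-06", "2015-12-07", "Inactive"),
    (40, "2016-02-24", "2016-08-04", "Active"),
]


def make_db():
    """Build an in-memory copy of tbl_tmp (ISO date strings compare correctly as text)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_tmp (id integer, start_dt date, end_dt date, status varchar(8))"
    )
    conn.executemany("INSERT INTO tbl_tmp VALUES (?, ?, ?, ?)", ROWS)
    return conn


def naive_active_count(conn, range_start, range_end):
    """Hypothetical naive attempt: filter on the dates and on status = 'Active' only."""
    sql = """
        SELECT count(DISTINCT id)
        FROM tbl_tmp
        WHERE start_dt >= ? AND end_dt <= ? AND status = 'Active'
    """
    return conn.execute(sql, (range_start, range_end)).fetchone()[0]


if __name__ == "__main__":
    conn = make_db()
    print(naive_active_count(conn, "2015-11-22", "2015-12-05"))
    print(naive_active_count(conn, "2015-11-22", "2015-12-10"))
```

On this data the naive filter returns 2 for both ranges, because the Inactive row for 30 is simply skipped rather than disqualifying the ID. The second case needs a query that takes that row into account.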

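2 answers:

Answer 0: (score: 0)

Assuming the records are complete for the time period, we can count all the IDs with an active status at any time in the range and subtract any that have an inactive status: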

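-- IDs with an Active row overlapping the range, minus IDs with an Inactive row overlapping it
select (count(distinct (case when t.end_dt >= r.range_start and
                                  t.start_dt <= r.range_end and
                                  t.status = 'Active'
                             then t.id
                        end)
             ) -
        count(distinct (case when t.end_dt >= r.range_start and
                                  t.start_dt <= r.range_end and
                                  t.status = 'Inactive'
                             then t.id
                        end)
             )
       ) as num_active_for_entire_range
from tbl_tmp t cross join
     (select date '2015-11-22' as range_start, date '2015-12-05' as range_end from dual
     ) r;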

Actually, this is simpler if you filter once:

-- Keep only rows overlapping the range, then subtract IDs that have an Inactive row
select (count(distinct t.id) -
        count(distinct (case when t.status = 'Inactive'
                             then t.id
                        end)
             )
       ) as num_active_for_entire_range
from tbl_tmp t cross join
     (select date '2015-11-22' as range_start, date '2015-12-05' as range_end
     ) r
where t.end_dt >= r.range_start and
      t.start_dt <= r.range_end;

Here is a db<>fiddle.
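As a quick check of the filter-once query against the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. The function name `active_for_entire_range` is hypothetical, and the range is passed as bound parameters instead of the cross-joined subquery because SQLite does not accept the `date '...'` literal syntax. ISO-formatted date strings still compare in the right order.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (30, "2015-11-22", "2015-11-22", "Active"),
    (30, "2015-11-23", "2015-12-06", "Active"),
    (40, "2015-11-26", "2015-11-26", "Active"),
    (40, "2015-11-27", "2016-02-23", "Active"),
    (30, "2015-12-06", "2015-12-07", "Inactive"),
    (40, "2016-02-24", "2016-08-04", "Active"),
]


def make_db():
    """Build an in-memory copy of tbl_tmp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_tmp (id integer, start_dt date, end_dt date, status varchar(8))"
    )
    conn.executemany("INSERT INTO tbl_tmp VALUES (?, ?, ?, ?)", ROWS)
    return conn


def active_for_entire_range(conn, range_start, range_end):
    """IDs with any row overlapping the range, minus IDs with an Inactive row in it."""
    sql = """
        SELECT count(DISTINCT t.id) -
               count(DISTINCT CASE WHEN t.status = 'Inactive' THEN t.id END)
        FROM tbl_tmp t
        WHERE t.end_dt >= :range_start
          AND t.start_dt <= :range_end
    """
    params = {"range_start": range_start, "range_end": range_end}
    return conn.execute(sql, params).fetchone()[0]


if __name__ == "__main__":
    conn = make_db()
    print(active_for_entire_range(conn, "2015-11-22", "2015-12-05"))
    print(active_for_entire_range(conn, "2015-11-22", "2015-12-10"))
```

On the sample rows this gives 2 for the range ending 2015-12-05 and 1 for the range ending 2015-12-10, matching the expected output in the question. The overlap test (end_dt >= range_start and start_dt <= range_end) picks up the Inactive row for 30 in the second case, which is what removes that ID from the count.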

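Answer 1: (score: 0)

I think what matters to you is the end of the range: 30 is active from the 22nd to the 6th, but you still want it counted as inactive because it is inactive from the 6th to the 7th (in case 2). So what you are interested in is the latest status of each id.

You can try the following query: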

Select t.* -- or count(1) for fetching just count
from (Select t.*,
             -- rn = 1 marks the most recent row for each id
             row_number()
               over (partition by id order by start_dt desc) as rn
      from your_table t
      Where <date_conditions>) t
Where t.status = 'Active' and t.rn = 1;

Cheers!
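The latest-status idea can be sketched the same way with Python's built-in sqlite3 module. Two things here are assumptions rather than part of the answer: the `<date_conditions>` placeholder is filled with the where clause from the question (start_dt >= range start and end_dt <= range end), and the function name `latest_status_active_count` is hypothetical. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (30, "2015-11-22", "2015-11-22", "Active"),
    (30, "2015-11-23", "2015-12-06", "Active"),
    (40, "2015-11-26", "2015-11-26", "Active"),
    (40, "2015-11-27", "2016-02-23", "Active"),
    (30, "2015-12-06", "2015-12-07", "Inactive"),
    (40, "2016-02-24", "2016-08-04", "Active"),
]


def make_db():
    """Build an in-memory copy of tbl_tmp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_tmp (id integer, start_dt date, end_dt date, status varchar(8))"
    )
    conn.executemany("INSERT INTO tbl_tmp VALUES (?, ?, ?, ?)", ROWS)
    return conn


def latest_status_active_count(conn, range_start, range_end):
    """Count ids whose most recent row inside the range is Active."""
    sql = """
        SELECT count(1)
        FROM (SELECT t.*,
                     row_number() OVER (PARTITION BY id ORDER BY start_dt DESC) AS rn
              FROM tbl_tmp t
              -- Assumed date conditions, taken from the question's where clause.
              WHERE t.start_dt >= ? AND t.end_dt <= ?) t
        WHERE t.status = 'Active' AND t.rn = 1
    """
    return conn.execute(sql, (range_start, range_end)).fetchone()[0]


if __name__ == "__main__":
    conn = make_db()
    print(latest_status_active_count(conn, "2015-11-22", "2015-12-05"))
    print(latest_status_active_count(conn, "2015-11-22", "2015-12-10"))
```

With these assumed conditions the sample data yields 2 for the first range and 1 for the second: in the second case the latest row for 30 within the range is the Inactive one, so only 40 is counted. A different choice of date conditions could change which row ranks first for each id.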