Grouping by time ranges with a modifier that changes over time

Asked: 2013-01-30 19:41:32

Tags: sql sql-server sql-server-2008 tsql

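After going through a similar question and finding that it never provided a complete solution, I have finally reached the heart of the problem I cannot solve. I am looking for the runs of consecutive days during which a person is prescribed a given number of drugs. Because prescriptions start and end, a person can have several non-contiguous intervals on X drugs. Also, I do not have SQL Server 2012. The following SQL script produces the result set of the query I will post shortly: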

create table test
(pat_id int, cal_date date, grp_nbr int, drug_qty int,[ranking] int)
go
insert into test(pat_id,cal_date, grp_nbr,drug_qty,[ranking])
values
(1, '1/8/2007',7,2, 1),
(1, '1/9/2007',7,2, 1),
(1, '1/10/2007',7,  2,1),
(1, '1/11/2007',7,  2,1),
(1, '1/12/2007',7,  2,1),
(1, '1/13/2007',7,  2,1),
(1, '1/14/2007',7,  2,1),
(1, '1/15/2007',7,  2,1),
(1, '6/1/2007',7,2, 1),
(1, '6/2/2007',7,2, 1),
(1, '6/3/2007',7,2, 1)

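Note that there are two non-contiguous intervals in which this person was on two drugs at the same time. On the omitted days, drug_qty was greater than two. The last column in this example is my attempt at adding another field to group by to help solve the problem (it did not work).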
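As a quick sanity check on the sample, the start of each interval can be listed by looking for dates that have no row for the previous day. This is only a sketch against the test table created above, and it assumes the date literals in the script are read as month/day/year.

```sql
-- A date starts an interval when the same patient has no row for the day before it.
SELECT t.pat_id,
       t.cal_date AS interval_start
FROM   test AS t
WHERE  NOT EXISTS (SELECT 1
                   FROM   test AS p
                   WHERE  p.pat_id = t.pat_id
                          AND p.cal_date = DATEADD(day, -1, t.cal_date))
ORDER  BY t.pat_id,
          t.cal_date;
```

Under that assumption, the sample yields two rows for pat_id 1, with interval starts of 2007-01-08 and 2007-06-01.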

Queries to create the tables:

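CREATE TABLE [dbo].[rx](
            [clmid] [int] NOT NULL, -- type assumed; the original post uses clmid as the key but omits its definition
            [pat_id] [int] NOT NULL,
            [fill_Date] [date] NOT NULL,
            -- PERSISTED so that the CHECK constraint below can reference the computed column
            [script_End_Date]  AS (dateadd(day,[days_Sup],[fill_Date])) PERSISTED,
            [drug_Name] [varchar](50) NULL,
            [days_Sup] [int] NOT NULL,
            [quantity] [float] NOT NULL,
            [drug_Class] [char](3) NOT NULL,
            [ofinterest] [bit] NOT NULL, -- type assumed; the query below filters on it but the post omits it
            CHECK (fill_Date <= script_End_Date),
            PRIMARY KEY CLUSTERED ([clmid] ASC)
);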


CREATE TABLE [dbo].[Calendar](
            [cal_date] [date] PRIMARY KEY,
            [Year] AS YEAR(cal_date) PERSISTED,
            [Month] AS MONTH(cal_date) PERSISTED,
            [Day] AS DAY(cal_date) PERSISTED,
            [julian_seq] AS 1+DATEDIFF(DD, CONVERT(DATE, CONVERT(varchar,YEAR(cal_date))+'0101'),cal_date),
            id int identity);

The query I used to produce the result set:

;WITH x 
     AS (SELECT rx.pat_id, 
                c.cal_date, 
                Count(DISTINCT rx.drug_name) AS distinctDrugs 
         FROM   rx, 
                calendar AS c 
         WHERE  c.cal_date BETWEEN rx.fill_date AND rx.script_end_date 
                AND rx.ofinterest = 1 
         GROUP  BY rx.pat_id, 
                   c.cal_date 
         -- The example result set was produced with HAVING Count(1) = 2 to illustrate the
         -- non-contiguous intervals; in practice I need the HAVING clause below.
         HAVING Count(*) > 1), 
     y 
     AS (SELECT x.pat_id, 
                x.cal_date 
                --c2.id is the row number in the calendar table. 
                , 
                c2.id - Row_number() 
                          OVER( 
                            partition BY x.pat_id 
                            ORDER BY x.cal_date) AS grp_nbr, 
                distinctdrugs 
         FROM   x, 
                calendar AS c2 
         WHERE  c2.cal_date = x.cal_date) 
SELECT *, 
       Rank() 
         OVER( 
           partition BY pat_id, grp_nbr 
           ORDER BY distinctdrugs) AS [ranking] 
FROM   y 
WHERE  y.pat_id = 1604012867 
       AND distinctdrugs = 2 

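Apart from the fact that I should not have a column named 'id' in the calendar table, is there anything particularly wrong with this approach? I can get the query to show me the separate intervals where distinctDrugs = x, but it only works for that one integer, not for any value > 1. By this I mean that I can find the separate intervals in which a patient is on two drugs, but only when I use = 2 in the HAVING clause, not > 1. I cannot do something like
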
SELECT pat_id, 
       Min(cal_date), 
       Max(cal_date), 
       distinctdrugs 
FROM   y 
GROUP  BY pat_id, 
          grp_nbr 

because this would also pick up the second set of non-contiguous dates. Does anyone know an elegant solution to this problem?

1 answer:

Answer 0: (score: 1)

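The key is a simple observation. If you have a run of consecutive dates, then the difference between each date and an increasing sequence of integers is constant. Assuming you are using SQL Server 2005 or later, the following does this: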

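select pat_id, drug_qty, MIN(cal_date) as interval_start, MAX(cal_date) as interval_end
from (select t.*,
             -- consecutive dates minus an increasing row number give the same value,
             -- so grp is constant within each unbroken run of days
             cast(cal_date as datetime) - ROW_NUMBER() over (partition by pat_id, drug_qty order by cal_date) as grp
      from test t
     ) t
-- drug_qty is part of the key because the row numbers restart for each quantity
group by pat_id, drug_qty, grp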
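To make the row-number subtraction concrete, here is a small self-contained sketch. The inline rows are hypothetical (they are not the asker's data, and the two days at quantity 3 are invented for illustration), the names demo, keyed and island_key are made up, and DATEADD is used as an equivalent way to subtract the row number from the date. It also shows one way to handle the "any quantity > 1" case from the question: partition and group by the quantity so that each quantity gets its own intervals.

```sql
WITH demo (pat_id, cal_date, drug_qty) AS (
    -- Hypothetical rows: two days on 2 drugs, two days on 3 drugs, a gap, then two days on 2 drugs.
    SELECT v.pat_id, CAST(v.cal_date AS date), v.drug_qty
    FROM (VALUES (1, '20070108', 2),
                 (1, '20070109', 2),
                 (1, '20070110', 3),
                 (1, '20070111', 3),
                 (1, '20070601', 2),
                 (1, '20070602', 2)) AS v(pat_id, cal_date, drug_qty)
),
keyed AS (
    -- Within one (pat_id, drug_qty) partition, consecutive days map to the same island_key.
    SELECT d.pat_id, d.cal_date, d.drug_qty,
           DATEADD(day,
                   -CAST(ROW_NUMBER() OVER (PARTITION BY d.pat_id, d.drug_qty
                                            ORDER BY d.cal_date) AS int),
                   d.cal_date) AS island_key
    FROM demo AS d
)
SELECT pat_id,
       drug_qty,
       MIN(cal_date) AS interval_start,
       MAX(cal_date) AS interval_end,
       COUNT(*)      AS days_in_interval
FROM keyed
GROUP BY pat_id, drug_qty, island_key
ORDER BY pat_id, interval_start;
```

For these rows the quantity-2 dates get island_key 2007-01-07 for the January pair and 2007-05-29 for the June pair, while the quantity-3 dates get 2007-01-09, so the query returns three intervals: 2007-01-08 to 2007-01-09 (2 drugs), 2007-01-10 to 2007-01-11 (3 drugs), and 2007-06-01 to 2007-06-02 (2 drugs). Keeping drug_qty in the GROUP BY matters because two different quantities can otherwise produce the same island_key by coincidence.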