Return records only when a user has 30 or more consecutive days

Time: 2011-08-11 14:32:09

Tags: sql sql-server sql-server-2005

I get the following data back from a query. I have put it in a temp table that I can now query (in real life there is obviously far more data; this is just an example):

EmpId      Date
1        2011-01-01
1        2011-01-02
1        2011-01-03
2        2011-02-03
3        2011-03-01
4        2011-03-02
5        2011-01-02

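I need to return only the EmpIds that have 30 or more consecutive days in the Date column. I also need to return the day count for those employees. An employee may have two or more separate runs of 30 or more consecutive days, and in that case I want one row per run. So if an employee has dates from 2011-01-01 to 2011-02-20, return that run and its count as one row. If the same employee also has dates from 2011-05-01 to 2011-07-01, return that as another row. Basically, every break in the consecutive days starts a separate record.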

4 answers:

Answer 0: (score: 4)

Using DENSE_RANK should do the trick:

;WITH sampledata
    AS (SELECT 1 AS id, DATEADD(day, -0, GETDATE())AS somedate
        UNION ALL SELECT 1, DATEADD(day, -1, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -2, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -3, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -4, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -5, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -10, GETDATE())
        UNION ALL SELECT 1, '2011-01-01 00:00:00'
        UNION ALL SELECT 1, '2010-12-31 00:00:00'
        UNION ALL SELECT 1, '2011-02-01 00:00:00'
        UNION ALL SELECT 1, DATEADD(day, -10, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, 0, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, -1, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, -2, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, -6, GETDATE())
        UNION ALL SELECT 3, DATEADD(day, 0, GETDATE())
        UNION ALL SELECT 4, DATEADD(day, 0, GETDATE())
        UNION ALL SELECT 5, DATEADD(day, 0, GETDATE()))
   , ranking
    AS (SELECT *, DENSE_RANK()OVER(PARTITION BY id ORDER BY DATEDIFF(day, 0, somedate)) - DATEDIFF(day, 0, somedate)AS dategroup
          FROM sampledata)
    SELECT id
         , MIN(somedate)AS range_start
         , MAX(somedate)AS range_end
         , DATEDIFF(day, MIN(somedate), MAX(somedate)) + 1 AS consecutive_days
      FROM ranking
     GROUP BY id, dategroup
     --HAVING DATEDIFF(day, MIN(somedate), MAX(somedate)) + 1 >= 30 --change as needed
     ORDER BY id, range_start
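
To see why DENSE_RANK fits sample data that contains the same day twice, here is a minimal Python sketch of the grouping arithmetic. It is an illustration only, not part of the answer's query: the function names and the August dates are made up, and the base date 1900-01-01 is assumed to match SQL Server's date 0.

```python
from datetime import date

# Assumption: date 0 in SQL Server corresponds to 1900-01-01.
BASE = date(1900, 1, 1)

def dense_rank_groups(dates):
    """dense rank minus day number; rows for the same day share a rank."""
    day_numbers = [(d - BASE).days for d in dates]
    rank_of = {n: r for r, n in enumerate(sorted(set(day_numbers)), start=1)}
    return [rank_of[n] - n for n in day_numbers]

def row_number_groups(sorted_dates):
    """row number minus day number; every row gets its own number."""
    day_numbers = [(d - BASE).days for d in sorted_dates]
    return [r - n for r, n in enumerate(day_numbers, start=1)]

# Hypothetical data: 2011-08-02 appears twice, then a gap before 2011-08-10.
dates = [date(2011, 8, 1), date(2011, 8, 2), date(2011, 8, 2),
         date(2011, 8, 3), date(2011, 8, 10)]

dense = dense_rank_groups(dates)
plain = row_number_groups(dates)
print(dense)  # the first four rows share one group value
print(plain)  # the duplicate day splits the same run into two group values
```

With a plain row number the duplicate shifts every later row by one, so a single run would be reported as two. Ranking by day keeps the run together.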

Answer 1: (score: 1)

Something like this should do the trick, though I haven't tested it.

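SELECT
    a.empid
  , COUNT(*) + 1 AS consecutive_count  -- N adjacent-day pairs span N + 1 days
  , MIN(a.mydate) AS startdate
FROM logins a
INNER JOIN logins b
  ON a.empid = b.empid AND DATEDIFF(day, a.mydate, b.mydate) = 1
GROUP BY a.empid
-- Caveat: all of an employee's runs are merged into one row, so the count is
-- only reliable when each employee has a single run and no duplicate dates.
-- Aliases cannot be used in GROUP BY or HAVING, so the expression is repeated.
HAVING COUNT(*) + 1 >= 30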

Answer 2: (score: 0)

This is a good use case for a recursive CTE. I borrowed @Davin's sample data:

with data AS --sample data 
( SELECT 1 as id ,DATEADD(DD,-0,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-1,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-2,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-3,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-4,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-5,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-10,GETDATE()) as date UNION ALL 
SELECT 1 as id ,'2011-01-01 00:00:00.000' as date UNION ALL 
SELECT 1 as id ,'2010-12-31 00:00:00.000' as date UNION ALL 
SELECT 1 as id ,'2011-02-01 00:00:00.000' as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-10,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,-1,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,-2,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,-6,GETDATE()) as date UNION ALL 
SELECT 3 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL 
SELECT 4 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL 
SELECT 5 as id ,DATEADD(DD,0,GETDATE()) as date   ) 
,CTE AS
(
    SELECT id, CAST(date as date) Date, Consec = 1
    FROM data
    UNION ALL
    SELECT t.id, CAST(t.date as DATE) Date, Consec = (c.Consec + 1)
    FROM data T
    INNER JOIN CTE c
        ON T.id = c.id
        AND CAST(t.date as date) = CAST(DATEADD(day, 1, c.date) as date)

)

SELECT id, MAX(consec) AS max_consecutive_days
FROM CTE
GROUP BY id
ORDER BY id

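Basically, this generates a lot of rows for each person. Every date starts a chain of length 1, and the recursive step extends a chain whenever the next calendar day is also present, so Consec records how many consecutive days end at that date. MAX(consec) is then the longest run for each id.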
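
The same chain-building idea can be sketched outside SQL. The Python below is only an illustration of the anchor and recursive steps, under the assumption that the input is a list of calendar dates for one id; longest_run is a made-up name, not something from the query above.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def longest_run(dates):
    """Length of the longest chain of consecutive calendar days in dates."""
    present = set(dates)
    # Anchor step: every date starts a chain of length 1.
    frontier = {(d, 1) for d in present}
    best = 0
    while frontier:
        best = max(best, max(length for _, length in frontier))
        # Recursive step: extend a chain when the next day is also present.
        frontier = {(d + ONE_DAY, length + 1)
                    for d, length in frontier
                    if d + ONE_DAY in present}
    return best

# Hypothetical dates for one id: a three-day run, then an isolated day.
sample = [date(2011, 1, 1), date(2011, 1, 2), date(2011, 1, 3), date(2011, 1, 5)]
print(longest_run(sample))
```

Like the CTE, this only reports the longest run per id. It does not return one row per run, which is what the question asks for.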

Answer 3: (score: 0)

Assuming there are no duplicate dates for the same employee:

;WITH ranged AS (
  SELECT
    EmpId,
    Date,
    RangeId = DATEDIFF(DAY, 0, Date)
            - ROW_NUMBER() OVER (PARTITION BY EmpId ORDER BY Date)
  FROM atable
)
SELECT
  EmpId,
  StartDate = MIN(Date),
  EndDate   = MAX(Date),
  DayCount  = DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1
FROM ranged
GROUP BY EmpId, RangeId
HAVING DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1 >= 30
ORDER BY EmpId, MIN(Date)

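DATEDIFF turns each date into an integer (the number of days between date 0, which is 1900-01-01, and Date). If the dates are consecutive, so are the integers. Using the sample data from the question, the DATEDIFF results would be: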

EmpId  Date        DATEDIFF
-----  ----------  --------
1      2011-01-01  40542
1      2011-01-02  40543
1      2011-01-03  40544
2      2011-02-03  40575
3      2011-03-01  40601
4      2011-03-02  40602
5      2011-01-02  40543
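
The day numbers in this table can be checked with plain date arithmetic. This is a small Python sketch, assuming date 0 is 1900-01-01 as stated above; day_number is a hypothetical helper name.

```python
from datetime import date

# Assumption: date 0 corresponds to 1900-01-01.
BASE = date(1900, 1, 1)

def day_number(d):
    """Whole days between the base date and d, like DATEDIFF(DAY, 0, d)."""
    return (d - BASE).days

# The sample rows from the question.
sample = [(1, date(2011, 1, 1)), (1, date(2011, 1, 2)), (1, date(2011, 1, 3)),
          (2, date(2011, 2, 3)), (3, date(2011, 3, 1)), (4, date(2011, 3, 2)),
          (5, date(2011, 1, 2))]

for emp_id, d in sample:
    print(emp_id, d, day_number(d))
```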

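Now, if you take each employee's rows, assign row numbers to them in date order, and take the difference between the numeric representation and the row number, you will find that the difference stays the same for consecutive numbers (and, therefore, for consecutive dates). Using a slightly different sample for a better illustration, it looks like this: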

Date        DATEDIFF  RowNum  RangeId
----------  --------  ------  -------
2011-01-01  40542     1       40541
2011-01-02  40543     2       40541
2011-01-03  40544     3       40541
2011-01-05  40546     4       40542
2011-01-07  40548     5       40543
2011-01-08  40549     6       40543
2011-01-09  40550     7       40543

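The specific value of RangeId doesn't matter, only the fact that it stays the same for consecutive dates. Based on that fact, you can use it as a grouping criterion to count the dates in a group and get the range bounds.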
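
As a cross-check of the grouping, here is a minimal Python sketch that computes the same RangeId values and collapses them into ranges. It assumes distinct dates for a single employee, and the name consecutive_ranges is invented for the illustration.

```python
from datetime import date
from itertools import groupby

# Assumption: date 0 corresponds to 1900-01-01.
BASE = date(1900, 1, 1)

def consecutive_ranges(dates):
    """Return (start, end, day_count) for each run of consecutive dates."""
    ordered = sorted(dates)
    # RangeId = day number minus row number, with row numbers in date order.
    keyed = [((d - BASE).days - row_num, d)
             for row_num, d in enumerate(ordered, start=1)]
    ranges = []
    # RangeId never decreases for sorted distinct dates, so equal keys are adjacent.
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        run = [d for _, d in group]
        ranges.append((run[0], run[-1], (run[-1] - run[0]).days + 1))
    return ranges

# The dates from the illustration table.
dates = [date(2011, 1, 1), date(2011, 1, 2), date(2011, 1, 3), date(2011, 1, 5),
         date(2011, 1, 7), date(2011, 1, 8), date(2011, 1, 9)]

ranges = consecutive_ranges(dates)
for start, end, days in ranges:
    print(start, end, days)
```

Keeping only the ranges whose day count is at least 30 would then correspond to the HAVING clause in the query.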

The query above uses DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1 to count the days, but you could use COUNT(*) instead.