Determining consecutive and independent PTO days

Time: 2018-09-12 01:19:13

Tags: sql presto

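Based on feedback, I am restructuring my question.

I am using SQL on a Presto database.

My goal is to report on employees who have taken consecutive days of PTO or sick leave since the beginning of 2018. My desired output is one row per stretch of time off for each employee, with its start and end dates:

[image in the original post: desired output]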

The main table I am using is d_employee_time_off:

[image in the original post: d_employee_time_off table]

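There are only two time_off_type_name values: PTO and sick leave.

ds is a datestamp; I use the latest ds (usually the current date).

I also have access to a date table called d_date:

[image in the original post: d_date table]

I can join the tables on d_employee_time_off.time_off_date = d_date.full_date.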

I hope I have structured this question in an understandable way.

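1 answer:

Answer 0: (score: 1)

I think what is needed here is to join the time-off data to a calendar table.

In the example solution below I generate the calendar "on the fly", but I believe you already have your own calendar table. Also, in my example I use the string 'Monday' and move backward from it (alternatively, you could use 'Friday' and move forward). I am not keen on language-dependent solutions, but since I am not a Presto user I could not test anything on Presto. So the example below uses some logic of its own and is written in SQL Server syntax, which I trust can be translated to Presto:

Query: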

;WITH
Digits AS (
          SELECT 0 AS digit UNION ALL
          SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL  
          SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL 
          SELECT 9
          )
, cal AS (
          SELECT 
                 ca.number
               , dateadd(day,ca.number,'20180101') as cal_date
               , datename(weekday,dateadd(day,ca.number,'20180101')) weekday
          FROM Digits [1s]
          CROSS JOIN Digits [10s]
          CROSS JOIN Digits [100s] /* add more like this as needed */
          cross apply (
              SELECT 
                      [1s].digit 
                    + [10s].digit * 10
                    + [100s].digit * 100  /* add more like this as needed */
                    AS number
              ) ca
          )
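, time_off AS (
        /* Join time off to the calendar. A time-off day that falls on a Monday
           also matches the preceding Saturday and Sunday, so a Friday followed
           by a Monday forms one consecutive range. Side effect: a range that
           begins on a Monday is reported as starting on the Saturday before. */
        select
            *
        from cal
        inner join mytable t on (cal.cal_date = t.time_off_date and cal.weekday <> 'Monday')
                             or (cal.cal_date between dateadd(day,-2,t.time_off_date)
                                  and t.time_off_date and datename(weekday,t.time_off_date) = 'Monday')
        )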
, starting_points AS (
        SELECT
            employee_id,
            cal_date,
            dense_rank() OVER(partition by employee_id
                ORDER BY
                    time_off_date
            ) AS rownum
        FROM
            time_off A
        WHERE
            NOT EXISTS (
                SELECT
                    *
                FROM
                    time_off B
                WHERE
                    B.employee_id = A.employee_id
                    AND B.cal_date = DATEADD(day, -1, A.cal_date)
            )
    )
, ending_points AS (
        SELECT
            employee_id,
            cal_date,
            dense_rank() OVER(partition by employee_id
                ORDER BY
                    time_off_date
            ) AS rownum
        FROM
            time_off A
        WHERE
            NOT EXISTS (
                SELECT
                    *
                FROM
                    time_off B
                WHERE
                    B.employee_id = A.employee_id
                    AND B.cal_date = DATEADD(day, 1, A.cal_date)
            )
    )
SELECT
    S.employee_id,
    S.cal_date AS start_range,
    E.cal_date AS end_range
FROM
    starting_points S
JOIN
    ending_points E
    ON E.employee_id = S.employee_id
    AND E.rownum = S.rownum
order by employee_id
    , start_range

Result:

    employee_id start_range end_range
1   200035      02.01.2018  02.01.2018 
2   200035      20.04.2018  27.04.2018 
3   200037      27.01.2018  29.01.2018 
4   200037      31.03.2018  02.04.2018 

See: http://rextester.com/MISZ50793

CREATE TABLE mytable(
   ID INT NOT NULL
  ,employee_id      INTEGER  NOT NULL
  ,type             VARCHAR(3) NOT NULL
  ,time_off_date         DATE  NOT NULL
  ,time_off_in_days INT NOT NULL
);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (1,200035,'PTO','2018-01-02',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (2,200035,'PTO','2018-04-20',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (3,200035,'PTO','2018-04-23',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (4,200035,'PTO','2018-04-24',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (5,200035,'PTO','2018-04-25',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (6,200035,'PTO','2018-04-26',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (7,200035,'PTO','2018-04-27',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (8,200037,'PTO','2018-01-29',1);
INSERT INTO mytable(id,employee_id,type,time_off_date,time_off_in_days) VALUES (9,200037,'PTO','2018-04-02',1);
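To check the logic independently of any SQL dialect, here is a minimal Python sketch of the same idea. It is not Presto and not part of the original answer. The function name `time_off_ranges` and the `rows` list are hypothetical; the rows simply mirror the sample data above. Each Monday is padded with the preceding Saturday and Sunday, and runs of consecutive calendar days are then collapsed into ranges.

```python
from datetime import date, timedelta
from itertools import groupby

ONE_DAY = timedelta(days=1)


def time_off_ranges(rows):
    """Collapse (employee_id, time_off_date) pairs into consecutive ranges.

    Hypothetical helper: returns a sorted list of (employee_id, start, end).
    """
    days = set()
    for employee_id, day in rows:
        days.add((employee_id, day))
        if day.weekday() == 0:  # Monday: bridge back over the weekend
            days.add((employee_id, day - ONE_DAY))
            days.add((employee_id, day - 2 * ONE_DAY))

    ranges = []
    for employee_id, group in groupby(sorted(days), key=lambda r: r[0]):
        start = prev = None
        for _, day in group:
            if prev is not None and day - prev == ONE_DAY:
                prev = day  # still inside the current run
                continue
            if start is not None:
                ranges.append((employee_id, start, prev))  # close the previous run
            start = prev = day
        ranges.append((employee_id, start, prev))  # close the last run
    return ranges


# Same (employee_id, time_off_date) pairs as the INSERT statements above.
rows = [
    (200035, date(2018, 1, 2)),
    (200035, date(2018, 4, 20)),
    (200035, date(2018, 4, 23)),
    (200035, date(2018, 4, 24)),
    (200035, date(2018, 4, 25)),
    (200035, date(2018, 4, 26)),
    (200035, date(2018, 4, 27)),
    (200037, date(2018, 1, 29)),
    (200037, date(2018, 4, 2)),
]

ranges = time_off_ranges(rows)
for employee_id, start, end in ranges:
    print(employee_id, start.isoformat(), end.isoformat())
```

With the sample rows this yields the same four ranges as the result table above. Python's `weekday()` returns a number rather than a day name, so this version does not depend on the session language the way `datename` does.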