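Determine the sum of consecutive values using SQL

Time: 2016-08-24 15:38:40

Tags: sql hana

I would like to determine the number of consecutive absences based on the table below. Initial research suggests that I can achieve this with window functions. For the data provided, the longest streak is four consecutive absences. Could you tell me how to put the running total of absences into a separate column?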

create table events (eventdate date, absence int);

insert into events values ('2014-10-01', 0);
insert into events values ('2014-10-08', 1);
insert into events values ('2014-10-15', 1);
insert into events values ('2014-10-22', 0);
insert into events values ('2014-11-05', 0);
insert into events values ('2014-11-12', 1);
insert into events values ('2014-11-19', 1);
insert into events values ('2014-11-26', 1);
insert into events values ('2014-12-03', 1);
insert into events values ('2014-12-10', 0);

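4 answers:

Answer 0: (score: 1)

You didn't specify which RDBMS you are using, but the following works with PostgreSQL's window functions and should translate to similar SQL engines: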

SELECT eventdate,
       absence,
       -- XXX We take advantage of the fact that absence is an int (1 or 0)
       --     otherwise we'd COUNT(1) OVER (...) and only conditionally
       --     display the count if absence = 1
       SUM(absence) OVER (PARTITION BY span ORDER BY eventdate)
         AS consecutive_absences
  FROM (SELECT spanstarts.*,
               SUM(newspan) OVER (ORDER BY eventdate) AS span
          FROM (SELECT events.*,
                CASE LAG(absence) OVER (ORDER BY eventdate)
                  WHEN absence THEN NULL
                  ELSE 1 END AS newspan
                  FROM events)
                spanstarts
        ) eventsspans
ORDER BY eventdate;

This gives you:

 eventdate  | absence | consecutive_absences 
------------+---------+----------------------
 2014-10-01 |       0 |                    0
 2014-10-08 |       1 |                    1
 2014-10-15 |       1 |                    2
 2014-10-22 |       0 |                    0
 2014-11-05 |       0 |                    0
 2014-11-12 |       1 |                    1
 2014-11-19 |       1 |                    2
 2014-11-26 |       1 |                    3
 2014-12-03 |       1 |                    4
 2014-12-10 |       0 |                    0

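There is a good dissection of the approach above on the pgsql-general mailing list. In short:

  1. The innermost query (spanstarts) uses LAG to find the start of each new span, whether it is a span of 1s or a span of 0s.
  2. The next query (eventsspans) identifies those spans by summing the number of span starts seen up to and including the current row. So we find span 1, then span 2, then 3, and so on.
  3. The outer query counts the absences within each span.
  4. As the SQL comment says, we cheat a little in #3 by taking advantage of the column's data type, but the net effect is the same.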
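
To make steps 1 and 2 concrete, here is a small sketch that exposes the intermediate newspan and span columns instead of the final sum. It reuses the inner subquery from the answer unchanged and assumes PostgreSQL and the events table from the question.

```sql
-- newspan is 1 on the first row of each span and NULL elsewhere;
-- the running SUM ignores the NULLs, so span increases by one at each span start.
SELECT eventdate,
       absence,
       newspan,
       SUM(newspan) OVER (ORDER BY eventdate) AS span
  FROM (SELECT events.*,
               CASE LAG(absence) OVER (ORDER BY eventdate)
                 WHEN absence THEN NULL
                 ELSE 1 END AS newspan
          FROM events) spanstarts
 ORDER BY eventdate;
```

For the sample data this should produce:

 eventdate  | absence | newspan | span
------------+---------+---------+------
 2014-10-01 |       0 |       1 |    1
 2014-10-08 |       1 |       1 |    2
 2014-10-15 |       1 |         |    2
 2014-10-22 |       0 |       1 |    3
 2014-11-05 |       0 |         |    3
 2014-11-12 |       1 |       1 |    4
 2014-11-19 |       1 |         |    4
 2014-11-26 |       1 |         |    4
 2014-12-03 |       1 |         |    4
 2014-12-10 |       0 |       1 |    5

The first row gets newspan = 1 because LAG returns NULL there, and a NULL never matches the WHEN branch.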

Answer 1: (score: 1)

Based on Gordon Linoff's answer here, you can do this:

SELECT TOP 1
        MIN(eventdate) AS spanStart ,
        MAX(eventdate) AS spanEnd,
        COUNT(*) AS spanLength
FROM    ( SELECT    e.* ,
                    ( ROW_NUMBER() OVER ( ORDER BY eventdate )
                      - ROW_NUMBER() OVER ( PARTITION BY absence ORDER BY eventdate ) ) AS grp
          FROM      events e
        ) t
GROUP BY grp ,
        absence
HAVING  absence = 1
ORDER BY COUNT(*) DESC;

Returns:

spanStart  | spanEnd    | spanLength
-----------+------------+-----------
2014-11-12 | 2014-12-03 | 4
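
The grouping key in that query is the difference between two row numbers. As an illustration only, the sketch below lists both row numbers and their difference for every row; the column names rn_all and rn_same are made up for this example, and the events table from the question is assumed.

```sql
-- rn_all numbers every row by date; rn_same numbers rows within the same absence value.
-- Inside an unbroken run both counters advance together, so their difference stays constant.
SELECT eventdate,
       absence,
       ROW_NUMBER() OVER (ORDER BY eventdate) AS rn_all,
       ROW_NUMBER() OVER (PARTITION BY absence ORDER BY eventdate) AS rn_same,
       ROW_NUMBER() OVER (ORDER BY eventdate)
       - ROW_NUMBER() OVER (PARTITION BY absence ORDER BY eventdate) AS grp
  FROM events
 ORDER BY eventdate;
```

For the sample data this should produce:

eventdate  | absence | rn_all | rn_same | grp
-----------+---------+--------+---------+----
2014-10-01 |       0 |      1 |       1 |   0
2014-10-08 |       1 |      2 |       1 |   1
2014-10-15 |       1 |      3 |       2 |   1
2014-10-22 |       0 |      4 |       2 |   2
2014-11-05 |       0 |      5 |       3 |   2
2014-11-12 |       1 |      6 |       3 |   3
2014-11-19 |       1 |      7 |       4 |   3
2014-11-26 |       1 |      8 |       5 |   3
2014-12-03 |       1 |      9 |       6 |   3
2014-12-10 |       0 |     10 |       4 |   6

The difference is only guaranteed to be distinct between runs of the same absence value, which is why the query groups by both grp and absence.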

Answer 2: (score: 0)

I don't know what your DBMS is, but this is from SQL Server. Hope it helps :)

-------------------------------------------------------------------------------------------
Query:

--tableRN numbers the rows of the question's events table in date order
;WITH tableRN AS (SELECT e.eventdate, e.absence,
                         ROW_NUMBER() OVER (ORDER BY e.eventdate) AS rn
                  FROM events e),

--cte is a recursive CTE that returns...
--absence (the row's absence value), level (how many times in a row a 1 has
--followed another 1), rn (row number), total (streaks counted so far)
cte (absence, level, rn, total) AS (
--anchor: a dummy row 0 placed before the first real row
SELECT 0, 0, 0, 0
UNION ALL 
SELECT r.absence, 
       CASE WHEN c.absence = 1 AND r.absence = 1 THEN c.level + 1
                                                 ELSE 0
       END, 
       c.rn + 1, 
       --count a streak as soon as its second absence is reached
       CASE WHEN c.absence = 1 AND r.absence = 1 AND c.level = 0 THEN c.total + 1
                                                                 ELSE c.total
       END
FROM cte c JOIN tableRN r ON c.rn + 1 = r.rn)

--This gets you the number of times there was a run of
--consecutive absences (twice or more in a row).
--MAXRECURSION 0 lifts SQL Server's default limit of 100 recursion levels.
SELECT MAX(c.total) AS Count FROM cte c
OPTION (MAXRECURSION 0)

-------------------------------------------------------------------------------------------
Results:

|Count|
+-----+
|  2  |
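
For comparison, the same number (how many runs of two or more consecutive absences exist) can be sketched without recursion, using the difference of two ROW_NUMBER() values as a group key. This is an alternative illustration rather than part of the answer; it assumes the events table from the question, and the alias names are arbitrary.

```sql
-- Each unbroken run of absences shares one grp value;
-- keep the runs with at least two rows and count them.
SELECT COUNT(*) AS streaks
  FROM (SELECT grp
          FROM (SELECT e.*,
                       ROW_NUMBER() OVER (ORDER BY eventdate)
                       - ROW_NUMBER() OVER (PARTITION BY absence ORDER BY eventdate) AS grp
                  FROM events e) t
         WHERE absence = 1
         GROUP BY grp
        HAVING COUNT(*) >= 2) s;
```

On the sample data the two qualifying runs are 2014-10-08 to 2014-10-15 and 2014-11-12 to 2014-12-03, so this should also return 2.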

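Answer 3: (score: -1)

Create a new column named consecutive_absence_count with a default of 0.

You can write a SQL procedure for inserts: fetch the latest record, retrieve its absence value, and determine whether the new record to be inserted is a presence or an absence.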

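If the latest record and the new record have consecutive dates and the new record is an absence (absence = 1 in the question's data), increase consecutive_absence_count; otherwise set it to 0.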
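
A minimal sketch of that idea, written as a single INSERT rather than a procedure. It assumes PostgreSQL syntax, that rows are always inserted in date order, and that existing rows already carry a correct count (a one-off backfill would be needed after adding the column). The date and absence literals are illustrative values only. The consecutive-dates check is left out because the sample dates are not evenly spaced.

```sql
-- Add the counter column; existing rows start at 0 and would need a backfill.
ALTER TABLE events ADD COLUMN consecutive_absence_count int NOT NULL DEFAULT 0;

-- Insert one new row. If it is an absence, continue the count stored on the
-- latest existing row; otherwise reset the count to 0.
INSERT INTO events (eventdate, absence, consecutive_absence_count)
SELECT n.eventdate,
       n.absence,
       CASE WHEN n.absence = 1
            THEN COALESCE((SELECT e.consecutive_absence_count
                             FROM events e
                            ORDER BY e.eventdate DESC
                            LIMIT 1), 0) + 1
            ELSE 0
       END
  FROM (SELECT DATE '2014-12-17' AS eventdate, 1 AS absence) n;
```

Because a present row always stores 0, adding 1 to the latest stored count gives 1 at the start of a new streak and extends the count otherwise. Unlike the window-function queries, this approach breaks if rows are inserted out of order or updated later.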