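Using Tabibitosan to count the most recent consecutive rows with similar data

Date: 2018-05-24 18:06:52

Tags: sql oracle group-by count partition

My project uses an Oracle SQL database. I have a historical table to which task statuses are appended weekly, and I am trying to query, for each task that is currently off track, the number of weeks it has been off track. Here is a sample extract from my source historical table: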

ID  WEEK    ON_TRACK
1   1   N
1   2   Y
1   3   N
1   4   N
1   5   N
2   1   N
2   2   N
2   3   Y
2   4   Y
2   5   N
3   1   N
3   2   N
3   3   Y
3   4   Y
3   5   Y

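I would like to return the number of consecutive "N" values in ON_TRACK, counting backwards from the most recent append. For the sample data above, I would like the query to return: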

ID  WKS_OFF_TRACK
1   3
2   1
3   0

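I've done some research, and the Tabibitosan method looks like the most logical approach. I've found enough examples to get the maximum number of consecutive values matching one criterion, but I have been unable to adapt them to return the most recent consecutive values matching two criteria (ID and ON_TRACK).

Here is what I have so far: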

-- This step creates a temp table with unique IDs for each weekly append to the historical table, and a 1 (if ON_TRACK = N) or 0 (if ON_TRACK = Y). It returns the expected data.
WITH HIST_TBL AS (
    SELECT DISTINCT(ID),
    CASE ON_TRACK
        WHEN 'N' THEN 1
        ELSE 0
        END AS OFF_TRACK,
    WEEK 
    FROM SOURCE_HISTORICAL_TBL
    ORDER BY ID,WEEK DESC)
-- End of temp table.

-- This is where I'm struggling: I want one row per project number, with the sum of the latest run of 1s (weeks the task has been off track) until a 0 is reached.
SELECT ID,
       SUM(OFF_TRACK) AS WKS_OFF_TRACK
FROM   (SELECT WEEK,
               ID,
               OFF_TRACK,
               ROW_NUMBER() OVER (ORDER BY WEEK DESC)
                 - ROW_NUMBER() OVER (PARTITION BY ID, OFF_TRACK ORDER BY WEEK DESC) GRP
        FROM   HIST_TBL)
GROUP BY ID, GRP
ORDER BY ID;

This code returns the cumulative total of all weeks each project has been off track, which for my sample data would be:

ID  WKS_OFF_TRACK
1   4
2   3
3   2

Any ideas where I'm going wrong?

2 answers:

Answer 0: (score: 1)

Here is one approach, which assumes that each ID was "on track" at some point in time:

select sht.id, count(*)
from SOURCE_HISTORICAL_TBL sht
where sht.week > (select max(sht2.week)
                  from SOURCE_HISTORICAL_TBL sht2
                  where sht2.id = sht.id and sht2.on_track = 'Y'
                )
group by sht.id;

Otherwise, you need one more condition to cover IDs that have never been on track:

select sht.id, count(*)
from SOURCE_HISTORICAL_TBL sht
where sht.week > (select max(sht2.week)
                  from SOURCE_HISTORICAL_TBL sht2
                  where sht2.id = sht.id and sht2.on_track = 'Y'
                 ) or
      not exists (select 1
                  from SOURCE_HISTORICAL_TBL sht2
                  where sht2.id = sht.id and sht2.on_track = 'Y'
                 )
group by sht.id;

You can also phrase this using analytic functions:

select id,
       sum(case when week > max_week_y or max_week_y is null then 1 else 0 end) as max_off_track
from (select sht.*,
             max(case when on_track = 'Y' then week end) over (partition by id) as max_week_y
      from SOURCE_HISTORICAL_TBL sht
     ) sht
group by id;

Note that this version returns 0 for IDs that are currently on track.

Answer 1: (score: 1)

You can do this in a single table scan:
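Since the question asks specifically about Tabibitosan, here is a minimal sketch of how the attempted query might be adapted. The original attempt appears to go wrong in two places: the first ROW_NUMBER() is not partitioned by ID, and the outer query sums every group rather than only the most recent one. When both row numbers are ordered by WEEK DESC within the same ID, the latest run is the only one whose difference is 0. The CTE names sample_hist and grouped are hypothetical, sample_hist is an inline copy of the question's sample rows, and the sketch assumes one row per (ID, WEEK).

```sql
-- sample_hist: hypothetical inline copy of the question's sample data
WITH sample_hist AS (
  SELECT 1 AS id, 1 AS week, 'N' AS on_track FROM DUAL UNION ALL
  SELECT 1, 2, 'Y' FROM DUAL UNION ALL
  SELECT 1, 3, 'N' FROM DUAL UNION ALL
  SELECT 1, 4, 'N' FROM DUAL UNION ALL
  SELECT 1, 5, 'N' FROM DUAL UNION ALL
  SELECT 2, 1, 'N' FROM DUAL UNION ALL
  SELECT 2, 2, 'N' FROM DUAL UNION ALL
  SELECT 2, 3, 'Y' FROM DUAL UNION ALL
  SELECT 2, 4, 'Y' FROM DUAL UNION ALL
  SELECT 2, 5, 'N' FROM DUAL UNION ALL
  SELECT 3, 1, 'N' FROM DUAL UNION ALL
  SELECT 3, 2, 'N' FROM DUAL UNION ALL
  SELECT 3, 3, 'Y' FROM DUAL UNION ALL
  SELECT 3, 4, 'Y' FROM DUAL UNION ALL
  SELECT 3, 5, 'Y' FROM DUAL
),
-- grouped: hypothetical name for the Tabibitosan step
grouped AS (
  SELECT id,
         week,
         CASE on_track WHEN 'N' THEN 1 ELSE 0 END AS off_track,
         -- Both row numbers restart per ID and run from the latest week backwards,
         -- so the difference is 0 only for rows in the most recent run.
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY week DESC)
           - ROW_NUMBER() OVER (PARTITION BY id, on_track ORDER BY week DESC) AS grp
  FROM   sample_hist
)
SELECT id,
       -- Count off-track weeks in the latest run only; a latest run of 'Y' sums to 0.
       SUM(CASE WHEN grp = 0 THEN off_track ELSE 0 END) AS wks_off_track
FROM   grouped
GROUP BY id
ORDER BY id;
```

For the sample rows this gives 3, 1 and 0 for IDs 1, 2 and 3, which matches the output requested in the question.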

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE SOURCE_HISTORICAL_TBL ( ID, WEEK, ON_TRACK ) AS
SELECT 1, 1, 'N' FROM DUAL UNION ALL
SELECT 1, 2, 'Y' FROM DUAL UNION ALL
SELECT 1, 3, 'N' FROM DUAL UNION ALL
SELECT 1, 4, 'N' FROM DUAL UNION ALL
SELECT 1, 5, 'N' FROM DUAL UNION ALL
SELECT 2, 1, 'N' FROM DUAL UNION ALL
SELECT 2, 2, 'N' FROM DUAL UNION ALL
SELECT 2, 3, 'Y' FROM DUAL UNION ALL
SELECT 2, 4, 'Y' FROM DUAL UNION ALL
SELECT 2, 5, 'N' FROM DUAL UNION ALL
SELECT 3, 1, 'N' FROM DUAL UNION ALL
SELECT 3, 2, 'N' FROM DUAL UNION ALL
SELECT 3, 3, 'Y' FROM DUAL UNION ALL
SELECT 3, 4, 'Y' FROM DUAL UNION ALL
SELECT 3, 5, 'Y' FROM DUAL UNION ALL
SELECT 4, 1, 'N' FROM DUAL UNION ALL
SELECT 5, 1, 'Y' FROM DUAL;

Query 1:

SELECT ID,
       GREATEST(
         COALESCE( MAX( CASE ON_TRACK WHEN 'N' THEN WEEK END ), 0 )
         - COALESCE( MAX( CASE ON_TRACK WHEN 'Y' THEN WEEK END ), 0 ),
         0
       ) AS weeks
FROM   SOURCE_HISTORICAL_TBL
GROUP BY id
ORDER BY id

Results:

| ID | WEEKS |
|----|-------|
|  1 |     3 |
|  2 |     1 |
|  3 |     0 |
|  4 |     1 |
|  5 |     0 |
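
One thing worth checking before relying on Query 1: it subtracts week numbers rather than counting rows, so it seems to depend on WEEK being a gap-free run of integers starting at 1 for each ID. The sketch below compares the subtraction with a row count on a hypothetical ID whose history has rows only for weeks 1, 3 and 5. The names gap_hist, weeks_by_subtraction and rows_since_last_y are made up for illustration and do not appear in the original answers.

```sql
-- gap_hist: hypothetical data with missing weeks (no rows for weeks 2 and 4)
WITH gap_hist AS (
  SELECT 6 AS id, 1 AS week, 'Y' AS on_track FROM DUAL UNION ALL
  SELECT 6, 3, 'N' FROM DUAL UNION ALL
  SELECT 6, 5, 'N' FROM DUAL
)
SELECT id,
       -- Query 1's approach: latest 'N' week minus latest 'Y' week, floored at 0
       GREATEST(
         COALESCE( MAX( CASE on_track WHEN 'N' THEN week END ), 0 )
         - COALESCE( MAX( CASE on_track WHEN 'Y' THEN week END ), 0 ),
         0
       ) AS weeks_by_subtraction,
       -- Row-counting approach: rows appended after the latest 'Y' week
       SUM( CASE WHEN week > max_week_y OR max_week_y IS NULL THEN 1 ELSE 0 END ) AS rows_since_last_y
FROM   ( SELECT g.*,
                MAX( CASE WHEN on_track = 'Y' THEN week END )
                  OVER ( PARTITION BY id ) AS max_week_y
         FROM   gap_hist g )
GROUP BY id;
```

Here the subtraction yields 4 (5 minus 1) while only 2 rows follow the last 'Y'. Which figure is right depends on whether a missing week should count as off track, so this is a point to confirm against the real data rather than a defect in the query.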