Presto SQL: count up by 1 for each row after a condition and down by 1 for each row before it

Time: 2018-08-08 22:22:28

Tags: sql presto

I have data like this:

date         group   state    value
2018-01-01   A       A        20
2018-01-02   A       A        0
2018-01-03   A       A        0
2018-01-04   A       B        70
2018-01-05   A       B        0
2018-01-06   A       B        80

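I want to count each date relative to the point where the state moves from A to B: the first date in state B is 0, the next day is 1, and so on, increasing by 1 for each date. The dates before the transition should count down by 1 each (-1, -2, ...). I also want to do this per "group" column, so that each group has its own separate count.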

This would be the output:

date         group   state    value   count
2018-01-01   A       A        20      -3
2018-01-02   A       A        0       -2
2018-01-03   A       A        0       -1
2018-01-04   A       B        70       0
2018-01-05   A       B        0        1
2018-01-06   A       B        80       2

I tried something like this:

SELECT ROW_NUMBER() OVER (PARTITION BY group, state ORDER BY date)

But I just end up with a column of 1s.

3 answers:

Answer 0 (score: 1)

Try this logic:

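SELECT t.*,
       ( CASE WHEN state = 'B' THEN
           -- rows in state B: 1, 2, 3, ... in date order
           ROW_NUMBER() OVER (PARTITION BY group_, state ORDER BY date)
         ELSE
           -- rows before the transition: row number minus partition size,
           -- i.e. ..., -2, -1, 0 (ORDER BY date makes this deterministic)
           ROW_NUMBER() OVER (PARTITION BY group_, state ORDER BY date) -
           COUNT(1) OVER (PARTITION BY group_, state)
         END ) - 1 AS count  -- shift by one so the first B row is 0
  FROM tab t;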

Answer 1 (score: 1)

If there is only one transition, I would suggest:

select t.*,
       -- min() picks the seqnum of the first B row, which becomes the zero point
       (seqnum - min(case when state = 'B' then seqnum end) over ()) as counter
from (select t.*, row_number() over (order by date) as seqnum
      from t
     ) t;

In other words, generate a sequence number for each row, then subtract the sequence number at which B first appears.
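The query above numbers all rows together, while the question asks for a separate count per group. A minimal sketch of a per-group variant follows, assuming each group has exactly one transition from A to B. It runs the SQL against an in-memory SQLite database (3.25 or later, for window functions) purely as a stand-in for checking the logic. The column name grp, the second group X, and the helper per_group_counter are assumptions made for illustration; in Presto the original column could be written as "group" in double quotes.

```python
import sqlite3

# Hypothetical sample data: the question's group A plus a second group X,
# added only to show that each group gets its own counter.
rows = [
    ("2018-01-01", "A", "A", 20),
    ("2018-01-02", "A", "A", 0),
    ("2018-01-03", "A", "A", 0),
    ("2018-01-04", "A", "B", 70),
    ("2018-01-05", "A", "B", 0),
    ("2018-01-06", "A", "B", 80),
    ("2018-01-01", "X", "A", 5),
    ("2018-01-02", "X", "B", 7),
    ("2018-01-03", "X", "B", 9),
]

# "grp" stands in for the reserved word "group".
# Both windows are partitioned by grp so the numbering restarts per group.
QUERY = """
SELECT date, grp,
       seqnum - MIN(CASE WHEN state = 'B' THEN seqnum END)
                    OVER (PARTITION BY grp) AS counter
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY grp ORDER BY date) AS seqnum
      FROM t) AS s
ORDER BY grp, date
"""


def per_group_counter(data):
    """Return (date, grp, counter) tuples; counter is 0 on each group's first B row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (date TEXT, grp TEXT, state TEXT, value INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", data)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in per_group_counter(rows):
        print(row)
```

With this data, group A should count from -3 up to 2 and group X from -1 up to 1. If a group never reaches state B, the MIN is NULL and so is the counter, which may or may not be the behavior you want.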

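Answer 2 (score: 0)

You can use the SUM window function for this.

This sqlfiddle runs on SQL Server, but prestodb also supports window functions; you only need to replace DATEDIFF with the date_diff function.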

CREATE TABLE T(
  date DATE,
  [group] VARCHAR(50),
  state VARCHAR(50),
  value INT
);

INSERT INTO T VALUES ('2018-01-01','A','A' ,20);
INSERT INTO T VALUES ('2018-01-02','A','A' ,0);
INSERT INTO T VALUES ('2018-01-03','A','A' ,0);
INSERT INTO T VALUES ('2018-01-04','A','B' ,70);
INSERT INTO T VALUES ('2018-01-05','A','B' ,0);
INSERT INTO T VALUES ('2018-01-06','A','B' ,80);

Query 1:

-- Presto syntax: "group" is double-quoted and the date_diff unit is a string literal
SELECT *,
       SUM(CASE
             WHEN state = 'B' AND MINDT = date THEN 0
             WHEN state = 'B' THEN 1
             ELSE -1
           END
          ) OVER (PARTITION BY "group", state ORDER BY
                  CASE WHEN state = 'B' THEN date_diff('day', MAXDT, date)
                       WHEN state = 'A' THEN date_diff('day', date, MINDT)
                  END) AS count
FROM (
  SELECT *,
         MAX(date) OVER (PARTITION BY "group", state ORDER BY date DESC) AS MAXDT,
         MIN(date) OVER (PARTITION BY "group", state ORDER BY date) AS MINDT
  FROM T
) tt
ORDER BY date

Results

|       date | group | state | value |      MAXDT |      MINDT | count |
|------------|-------|-------|-------|------------|------------|-------|
| 2018-01-01 |     A |     A |    20 | 2018-01-03 | 2018-01-01 |    -3 |
| 2018-01-02 |     A |     A |     0 | 2018-01-03 | 2018-01-01 |    -2 |
| 2018-01-03 |     A |     A |     0 | 2018-01-03 | 2018-01-01 |    -1 |
| 2018-01-04 |     A |     B |    70 | 2018-01-06 | 2018-01-04 |     0 |
| 2018-01-05 |     A |     B |     0 | 2018-01-06 | 2018-01-04 |     1 |
| 2018-01-06 |     A |     B |    80 | 2018-01-06 | 2018-01-04 |     2 |
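One way to sanity-check the running-SUM logic without a Presto or SQL Server instance is to port it to SQLite through Python's sqlite3 module. The sketch below is only that, a port under stated assumptions: grp replaces the reserved word group, julianday(b) - julianday(a) plays the role of a day-level date difference from a to b, and the helper running_sum_counter is a hypothetical name. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

# The six sample rows from the question.
ROWS = [
    ("2018-01-01", "A", "A", 20),
    ("2018-01-02", "A", "A", 0),
    ("2018-01-03", "A", "A", 0),
    ("2018-01-04", "A", "B", 70),
    ("2018-01-05", "A", "B", 0),
    ("2018-01-06", "A", "B", 80),
]

# State B rows are summed forward in time (0, then +1 per row);
# state A rows are summed backward in time (-1 per row), because their
# ORDER BY key grows as the date gets earlier.
QUERY = """
SELECT date, state,
       SUM(CASE
             WHEN state = 'B' AND mindt = date THEN 0
             WHEN state = 'B' THEN 1
             ELSE -1
           END) OVER (PARTITION BY grp, state
                      ORDER BY CASE
                                 WHEN state = 'B' THEN julianday(date) - julianday(maxdt)
                                 WHEN state = 'A' THEN julianday(mindt) - julianday(date)
                               END) AS counter
FROM (SELECT t.*,
             MAX(date) OVER (PARTITION BY grp, state ORDER BY date DESC) AS maxdt,
             MIN(date) OVER (PARTITION BY grp, state ORDER BY date) AS mindt
      FROM t) AS tt
ORDER BY date
"""


def running_sum_counter(rows):
    """Load rows into an in-memory table and return (date, state, counter) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (date TEXT, grp TEXT, state TEXT, value INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_sum_counter(ROWS):
        print(row)
```

The counter column should match the count column in the results table above. Note that this approach relies on the ordering key being distinct within each partition; duplicate dates in a group would make tied rows share the same running total.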