Question

I have this table:

prod | customer |   city  | num  |       time         | isextra
-----+----------+---------+------+--------------------+-------
 1   | Jim      |  Venice |  5   |2015-08-27 1:10:00  | 0
 1   | Jim      |  Venice |  5   |2015-08-27 1:10:15  | 0
 1   | Jim      |  Venice |  5   |2015-08-27 1:10:28  | 0
 4   | Jane     |  Vienna |  8   |2018-06-04 2:20:43  | 0
 4   | Jane     |  Vienna |  8   |2018-06-04 2:20:43  | 0
 4   | Jane     |  Vienna |  8   |2018-06-04 2:20:49  | 0
 4   | Jane     |  Vienna |  8   |2018-06-04 2:30:55  | 0
 7   | Jack     | Vilnius |  4   |2015-09-15 2:20:55  | 0
 7   | Jake     |   Vigo  |  9   |2018-01-01 10:20:05 | 0
 7   | Jake     |   Vigo  |  2   |2018-01-01 10:20:25 | 0

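I want to group the rows that share the same prod, customer, city, and num, and then set the 'isextra' field to 1 on every row whose time is within 30 seconds of the first row in its group (the first row itself stays 0). The result should look like this: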

prod | customer |   city  | num  |       time         | isextra
-----+----------+---------+------+--------------------+-------
 1   | Jim      |  Venice |  5   |2015-08-27 1:10:00  | 0
 1   | Jim      |  Venice |  5   |2015-08-27 1:10:15  | 1
 1   | Jim      |  Venice |  5   |2015-08-27 1:10:28  | 1
 4   | Jane     |  Vienna |  8   |2018-06-04 2:20:43  | 0
 4   | Jane     |  Vienna |  8   |2018-06-04 2:20:43  | 1
 4   | Jane     |  Vienna |  8   |2018-06-04 2:20:49  | 1
 4   | Jane     |  Vienna |  8   |2018-06-04 2:30:55  | 0
 7   | Jack     | Vilnius |  4   |2015-09-15 2:20:55  | 0
 7   | Jake     |   Vigo  |  9   |2018-01-01 10:20:05 | 0
 7   | Jake     |   Vigo  |  2   |2018-01-01 10:20:25 | 0

Here are the table definition and the data:

create table mytable (prod int, customer varchar, city varchar, num int, time timestamp, isextra smallint);


insert into mytable values (1, 'Jim', 'Venice', 5, '2015-08-27 1:10:00',  0);
insert into mytable values (1, 'Jim', 'Venice',  5, '2015-08-27 1:10:15',  0);
insert into mytable values (1, 'Jim', 'Venice',  5, '2015-08-27 1:10:28',  0);
insert into mytable values (4, 'Jane',  'Vienna',   8,   '2018-06-04 2:20:43',   0);
insert into mytable values (4, 'Jane',  'Vienna',   8,   '2018-06-04 2:20:43',   0);
insert into mytable values (4, 'Jane',  'Vienna',   8,   '2018-06-04 2:20:49',   0);
insert into mytable values (4, 'Jane',  'Vienna',   8,   '2018-06-04 2:30:55',   0);
insert into mytable values (7, 'Jack', 'Vilnius',   4,   '2015-09-15 2:20:55',   0);
insert into mytable values (7, 'Jake',   'Vigo',    9,   '2018-01-01 10:20:05',  0);
insert into mytable values (7, 'Jake',   'Vigo',    2,   '2018-01-01 10:20:25',  0);

All I have so far is:

UPDATE mytable
SET isextra = 1
FROM ( 
  select *, 
  row_number() over (partition by prod, customer, city, num order by time asc)
    as t from mytable
) AS sequence

I'm stuck here...

Any ideas are appreciated, thanks!

Answer 1

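I would write the select using window functions:

select t.*,
       -- the question asks for a 30-second window after the first row of each group
       (case when time > min_time and
                  time < dateadd(second, 30, min_time)
             then 1 else 0 
        end) as is_extra
from (select m.*,
             min(time) over (partition by prod, customer, city, num) as min_time
      from mytable m
     ) t;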

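The only problem is exact duplicates that share the first row's timestamp: they are not flagged. We can fix that with a row number:

select t.*,
       -- row_number() starts at 1, so seqnum > 1 skips only the first row of each group;
       -- a duplicate that shares the first row's timestamp is still flagged
       (case when seqnum > 1 and time < dateadd(second, 30, min_time)
             then 1 else 0 
        end) as is_extra
from (select m.*,
             min(time) over (partition by prod, customer, city, num) as min_time,
             row_number() over (partition by prod, customer, city, num order by time) as seqnum
      from mytable m
     ) t;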

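Unfortunately, turning this into an update is tricky, because your example contains rows that are exact duplicates.

If each row had a unique id, you could turn it into an update: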

-- assumes mytable has a unique id column, which the table in the question does not
update mytable
    set isextra = tt.new_isextra
    from (select t.*,
                 (case when seqnum > 1 and time < dateadd(second, 30, min_time)
                       then 1 else 0
                  end) as new_isextra
          from (select m.*,
                       min(time) over (partition by prod, customer, city, num) as min_time,
                       row_number() over (partition by prod, customer, city, num order by time) as seqnum
                from mytable m
                ) t
         ) tt
     where mytable.id = tt.id;
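
The update above joins on an id column that the table in the question does not have. One possible way to get such a column, sketched here under assumptions, is to copy the data into a new table and number the rows with row_number(). The table name mytable_with_id is hypothetical, and whether CREATE TABLE AS with a window function behaves the same on your Redshift version should be checked before relying on it.

```sql
-- Hypothetical helper table: a copy of mytable plus a surrogate id.
-- row_number() hands out distinct numbers even to rows that are exact
-- duplicates, so every row ends up with its own id.
create table mytable_with_id as
select row_number() over (order by prod, customer, city, num, time) as id,
       m.*
from mytable m;

-- Sanity check: both counts should be equal if every id is unique.
select count(*) as row_count,
       count(distinct id) as distinct_ids
from mytable_with_id;
```

The update would then be run against mytable_with_id instead of mytable. Which of two identical rows receives the lower id is arbitrary, which is harmless here because the duplicates are interchangeable.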

Answer 2

I don't know whether this will work on PostgreSQL 8.0 (Redshift), but it's worth a try:

update mytable a
  set isextra = 1 
  from (
    select prod, customer, city, num, min(time) as mintime
      from mytable 
      group by prod, customer, city, num
  ) b
  where a.prod = b.prod
    and a.customer = b.customer
    and a.city = b.city
    and a.num = b.num
    and a.time <= b.mintime + interval '30 seconds'
    and a.time <> b.mintime;

Result:

prod  customer  city     num  time                   isextra
----  --------  ----     ---  ---------------------  -------
1     Jim       Venice   5    2015-08-27 01:10:00.0  0
1     Jim       Venice   5    2015-08-27 01:10:15.0  1
1     Jim       Venice   5    2015-08-27 01:10:28.0  1
4     Jane      Vienna   8    2018-06-04 02:20:43.0  0
4     Jane      Vienna   8    2018-06-04 02:20:43.0  0
4     Jane      Vienna   8    2018-06-04 02:20:49.0  1
4     Jane      Vienna   8    2018-06-04 02:30:55.0  0
7     Jack      Vilnius  4    2015-09-15 02:20:55.0  0
7     Jake      Vigo     2    2018-01-01 10:20:25.0  0
7     Jake      Vigo     9    2018-01-01 10:20:05.0  0
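
Note that in this result both 02:20:43 rows stay at 0, whereas the desired output in the question sets one of them to 1: the condition a.time <> b.mintime cannot tell the two identical rows apart. Before running the update, it may help to preview which rows it would touch. The following is a minimal sketch that reuses the same join and filter as a plain select; it assumes the mytable definition from the question and changes no data.

```sql
-- Preview of the rows the update would flag, using the same conditions.
select a.prod, a.customer, a.city, a.num, a.time
from mytable a
join (
    -- earliest timestamp per (prod, customer, city, num) group
    select prod, customer, city, num, min(time) as mintime
    from mytable
    group by prod, customer, city, num
) b
  on  a.prod = b.prod
  and a.customer = b.customer
  and a.city = b.city
  and a.num = b.num
where a.time <= b.mintime + interval '30 seconds'
  and a.time <> b.mintime
order by a.prod, a.customer, a.time;
```

With the sample data from the question this should return three rows: the two later Jim rows and the 02:20:49 Jane row, matching the rows set to 1 in the result above.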

Redshift or any SQL: update similar rows in a table whose time is within 30 seconds
