I have some data that looks like this:
userid | listno | market | owned | time_stamp |
--------+-----------+---------------+-----------+----------------
A | 1234 | 1 | 0 | 2018-02-21 11:22:59 |
A | 1234 | 1 | 0 | 2018-03-15 01:11:59 |
A | 1234 | 1 | 1 | 2018-03-04 15:07:10 |
A | 1234 | 1 | 0 | 2018-03-07 02:33:36 |
A | 1234 | 1 | 0 | 2018-03-08 21:37:21 |
A | 1234 | 1 | 1 | 2018-03-08 21:50:44 |
A | 1234 | 1 | 0 | 2018-03-10 06:29:41 |
A | 1234 | 1 | 0 | 2018-03-11 12:33:42 |
A | 1234 | 1 | 0 | 2018-03-13 00:32:57 |
A | 1234 | 1 | 0 | 2018-03-14 08:05:20 |
A | 1234 | 1 | 0 | 2018-02-18 08:00:27 |
A | 1234 | 1 | 1 | 2018-02-18 15:01:43 |
A | 1234 | 1 | 0 | 2018-02-19 21:14:26 |
A | 1234 | 1 | 1 | 2018-03-14 10:41:41 |
A | 1234 | 1 | 1 | 2018-03-16 00:55:45 |
A | 1234 | 1 | 0 | 2018-03-16 01:00:25 |
A | 1234 | 1 | 1 | 2018-03-16 01:05:18 |
A | 1234 | 1 | 0 | 2018-03-16 01:11:16 |
A | 1234 | 1 | 1 | 2018-03-16 01:21:14 |
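I want to group by hourly intervals and then do some calculations. I know how to write the calculations, but getting the grouping right is giving me trouble. I want to lead each timestamp by the next value, but also round the lowest timestamp in each hour down to the hour, and round the highest timestamp in that hour up to the 59th minute of that hour. This is the query I am using:
SELECT userid, listno, market, owned, time_stamp,
       lead(time_stamp, 1) OVER (PARTITION BY userid, listno, market, date_trunc('hour', time_stamp)
                                 ORDER BY time_stamp asc) AS next_ts
FROM tableA
ORDER BY listno, time_stamp asc;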
That query gives me this:
userid | listno | market | owned | time_stamp | next_ts
--------+-----------+---------------+-----------+---------------------+---------------------
A | 1234 | 1 | 0 | 2018-02-21 11:22:59 |
A | 1234 | 1 | 0 | 2018-03-15 01:11:59 |
A | 1234 | 1 | 1 | 2018-03-04 15:07:10 |
A | 1234 | 1 | 0 | 2018-03-07 02:33:36 |
A | 1234 | 1 | 0 | 2018-03-08 21:37:21 | 2018-03-08 21:50:44
A | 1234 | 1 | 1 | 2018-03-08 21:50:44 |
A | 1234 | 1 | 0 | 2018-03-10 06:29:41 |
A | 1234 | 1 | 0 | 2018-03-11 12:33:42 |
A | 1234 | 1 | 0 | 2018-03-13 00:32:57 |
A | 1234 | 1 | 0 | 2018-03-14 08:05:20 |
A | 1234 | 1 | 0 | 2018-02-18 08:00:27 |
A | 1234 | 1 | 1 | 2018-02-18 15:01:43 |
A | 1234 | 1 | 0 | 2018-02-19 21:14:26 |
A | 1234 | 1 | 1 | 2018-03-14 10:41:41 |
A | 1234 | 1 | 1 | 2018-03-16 00:55:45 |
A | 1234 | 1 | 0 | 2018-03-16 01:00:25 | 2018-03-16 01:05:18
A | 1234 | 1 | 1 | 2018-03-16 01:05:18 | 2018-03-16 01:11:16
A | 1234 | 1 | 0 | 2018-03-16 01:11:16 | 2018-03-16 01:21:14
A | 1234 | 1 | 1 | 2018-03-16 01:21:14 | 2018-03-16 01:37:38
But what I want is for the next_ts column to be rounded up or down where needed:
userid | listno | market | owned | time_stamp | next_ts
--------+-----------+---------------+-----------+---------------------+---------------------
A | 1234 | 1 | 0 | 2018-02-21 11:22:59 |
A | 1234 | 1 | 0 | 2018-03-15 01:11:59 |
A | 1234 | 1 | 1 | 2018-03-04 15:07:10 |
A | 1234 | 1 | 0 | 2018-03-07 02:33:36 |
A | 1234 | 1 | 0 | 2018-03-08 21:37:21 | 2018-03-08 21:59:59
A | 1234 | 1 | 1 | 2018-03-08 21:50:44 |
A | 1234 | 1 | 0 | 2018-03-10 06:29:41 |
A | 1234 | 1 | 0 | 2018-03-11 12:33:42 |
A | 1234 | 1 | 0 | 2018-03-13 00:32:57 |
A | 1234 | 1 | 0 | 2018-03-14 08:05:20 |
A | 1234 | 1 | 0 | 2018-02-18 08:00:27 |
A | 1234 | 1 | 1 | 2018-02-18 15:01:43 |
A | 1234 | 1 | 0 | 2018-02-19 21:14:26 |
A | 1234 | 1 | 1 | 2018-03-14 10:41:41 |
A | 1234 | 1 | 1 | 2018-03-16 00:55:45 |
A | 1234 | 1 | 0 | 2018-03-16 01:00:25 | 2018-03-16 01:00:00
A | 1234 | 1 | 1 | 2018-03-16 01:05:18 | 2018-03-16 01:11:16
A | 1234 | 1 | 0 | 2018-03-16 01:11:16 | 2018-03-16 01:21:14
A | 1234 | 1 | 1 | 2018-03-16 01:21:14 | 2018-03-16 01:59:59
How would I achieve this?
Answer 0 (score: 1)
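The idea is:
- use the row_number function to identify the first and last rows in each hour;
- use those row numbers in a CASE statement to modify the timestamps as needed.
This should produce the output you specified: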
WITH
ordering AS (
    SELECT userid, listno, market, owned, time_stamp
        -- next timestamp within the same user/list/market/hour group
        ,lead(time_stamp, 1) OVER (PARTITION BY userid, listno, market, date_trunc('hour', time_stamp)
                                   ORDER BY time_stamp asc) AS next_ts
        -- position counted from the start of the hour
        ,row_number() OVER (PARTITION BY userid, listno, market, date_trunc('hour', time_stamp)
                            ORDER BY time_stamp asc) AS rnum_asc
        -- position counted from the end of the hour
        ,row_number() OVER (PARTITION BY userid, listno, market, date_trunc('hour', time_stamp)
                            ORDER BY time_stamp desc) AS rnum_desc
    FROM tableA
)
SELECT
    userid, listno, market, owned, time_stamp
    ,case
        -- first row of the hour: round next_ts down to the hour
        -- (branches are checked in order, so this one wins when a row matches both)
        when rnum_asc=1 then date_trunc('hour',next_ts)
        -- second-to-last row of the hour: round next_ts up to hh:59:59
        when rnum_desc=2 then date_trunc('hour',next_ts)+interval '59 minutes 59 seconds'
        else next_ts
    end as next_ts
FROM ordering
ORDER BY listno, time_stamp asc;
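However, this part of the desired output
2018-03-16 01:00:25 | 2018-03-16 01:00:00
2018-03-16 01:05:18 | 2018-03-16 01:11:16
looks odd to me, because next_ts is earlier than time_stamp. It seems you are trying to build intervals from a stream of events, and what you actually need is to round down the first time_stamp rather than the first next_ts, so that you get a series of contiguous intervals that starts at 00:00 and ends at 59:59 within each hour. To do that, you only need to slightly rewrite the statement above (the CASE statements for the affected columns, time_stamp among them). The idea stays the same.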
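One possible reading of that last suggestion, as a minimal sketch for PostgreSQL rather than a definitive rewrite: the first row of each hour has its time_stamp rounded down to the start of the hour, and the last row of each hour (whose lead value is NULL) gets hh:59:59 as its end. The table and column names come from the question; the window name w and the names hour_start, interval_start and interval_end are assumptions introduced here for illustration. It also assumes time_stamp is never NULL.

```sql
WITH ordering AS (
    SELECT userid, listno, market, owned, time_stamp,
           -- start of the hour this row falls into
           date_trunc('hour', time_stamp) AS hour_start,
           -- next timestamp within the same hour; NULL for the last row of the hour
           lead(time_stamp) OVER w AS next_ts,
           -- 1 for the earliest row of the hour
           row_number() OVER w AS rnum_asc
    FROM tableA
    WINDOW w AS (PARTITION BY userid, listno, market, date_trunc('hour', time_stamp)
                 ORDER BY time_stamp)
)
SELECT userid, listno, market, owned,
       -- first row of the hour starts at hh:00:00, every other row keeps its own timestamp
       CASE WHEN rnum_asc = 1 THEN hour_start ELSE time_stamp END AS interval_start,
       -- last row of the hour ends at hh:59:59, every other row ends at the next timestamp
       COALESCE(next_ts, hour_start + interval '59 minutes 59 seconds') AS interval_end
FROM ordering
ORDER BY listno, interval_start;
```

With this shape, an hour that contains a single row yields one interval covering the whole hour, and rnum_desc is no longer needed because the NULL from lead already marks the last row.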