I have two timestamp columns in a table:
usage_from | usage_till
---------------------+--------------------
2013-10-09 23:08:17 | 2013-10-09 23:16:00
2013-10-09 23:08:17 | 2013-10-09 23:08:19
2013-10-09 23:08:17 | 2013-10-10 18:58:22
2013-10-09 23:08:17 | 2013-10-09 23:15:05
2013-10-09 23:08:17 | 2013-10-09 23:09:00
2013-10-09 23:08:17 | 2013-10-09 23:08:20
2013-10-09 23:08:17 | 2013-10-09 23:32:04
2013-10-09 23:08:17 | 2013-10-10 02:02:03
2013-10-09 23:08:17 | 2013-10-10 07:31:00
2013-10-09 23:08:17 | 2013-10-10 22:41:04
I need to split the rows as follows:
usage_from | usage_till
---------------------+-----------------------
2013-10-09 23:08:17 | 2013-10-09 23:16:00
2013-10-09 23:08:17 | 2013-10-09 23:08:19
2013-10-09 23:08:17 | 2013-10-10 02:00:00
2013-10-10 02:00:00 | 2013-10-10 18:58:22 -- split
2013-10-09 23:08:17 | 2013-10-09 23:15:05
2013-10-09 23:08:17 | 2013-10-09 23:09:00
2013-10-09 23:08:17 | 2013-10-09 23:08:20
2013-10-09 23:08:17 | 2013-10-09 23:32:04
2013-10-09 23:08:17 | 2013-10-10 02:00:00
2013-10-10 02:00:00 | 2013-10-10 02:02:03 -- split
2013-10-09 23:08:17 | 2013-10-10 02:00:00
2013-10-10 02:00:00 | 2013-10-10 07:31:00 -- split
2013-10-09 23:08:17 | 2013-10-10 02:00:00
2013-10-10 02:00:00 | 2013-10-10 22:41:04 -- split
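In the example above, I split each time range at 02:00:00.
After many attempts I can split the values as shown below, but only into extra columns, not into separate rows.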
usage_from | usage_till | end_time_1 | end_time_2
---------------------+---------------------+---------------------+---------------------
2013-10-09 23:08:17 | 2013-10-09 23:16:00 | 2013-10-09 23:16:00 | 2013-10-11 02:00:00
2013-10-09 23:08:17 | 2013-10-09 23:08:19 | 2013-10-09 23:08:19 | 2013-10-11 02:00:00
2013-10-09 23:08:17 | 2013-10-10 18:58:22 | 2013-10-10 02:00:00 | 2013-10-10 18:58:22
2013-10-09 23:08:17 | 2013-10-09 23:15:05 | 2013-10-09 23:15:05 | 2013-10-11 02:00:00
2013-10-09 23:08:17 | 2013-10-09 23:09:00 | 2013-10-09 23:09:00 | 2013-10-11 02:00:00
2013-10-09 23:08:17 | 2013-10-09 23:08:20 | 2013-10-09 23:08:20 | 2013-10-11 02:00:00
2013-10-09 23:08:17 | 2013-10-09 23:32:04 | 2013-10-09 23:32:04 | 2013-10-11 02:00:00
2013-10-09 23:08:17 | 2013-10-10 02:02:03 | 2013-10-10 02:00:00 | 2013-10-10 02:02:03
2013-10-09 23:08:17 | 2013-10-10 07:31:00 | 2013-10-10 02:00:00 | 2013-10-10 07:31:00
2013-10-09 23:08:17 | 2013-10-10 22:41:04 | 2013-10-10 02:00:00 | 2013-10-10 22:41:04
Any idea how to do this? I have been struggling with it for the last few days. I am using Redshift 1.0.757 (based on PostgreSQL 8.0.2).
Answer 0 (score: 2)
If¹ Redshift supports the basic form of generate_series(), this might work. At least it works in Postgres 8.3:
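    -- The 02:00 boundary is derived from the shifted date, so an end time
    -- earlier than 02:00 on its calendar day still maps to the correct boundary.
    SELECT CASE WHEN split > 0 AND g = 0 THEN usage_from
                WHEN split > 0 AND g = 1 THEN (usage_till - '2:0'::time)::date + '2:0'::time
                ELSE usage_from END AS usage_from
         , CASE WHEN split > 0 AND g = 0 THEN (usage_till - '2:0'::time)::date + '2:0'::time
                WHEN split > 0 AND g = 1 THEN usage_till
                ELSE usage_till END AS usage_till
    FROM  (
       SELECT *, generate_series(0, split) AS g     -- g = 0 .. split, one row per piece
       FROM  (
          SELECT *
               , (usage_till - '2:0'::time)::date
               - (usage_from - '2:0'::time)::date AS split  -- results in integer
          FROM   t
          ) sub1
       ) sub2;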
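In the inner subquery sub1, I determine whether the time range crosses 2 a.m. and store the result in the column split. I assume that a time range never crosses 2 a.m. twice, but the query could easily be adapted for that. generate_series(0, split) then returns one row for a range that does not cross 2 a.m. (split = 0) and two rows for a range that does (split = 1).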
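To make the split arithmetic concrete, here is a small worked example using two ranges from the question. It is only a sketch for plain Postgres, where subtracting a time from a timestamp is accepted; the column aliases are made up for illustration, and whether Redshift accepts the same expressions is not verified here.

```sql
-- Shifting both ends back by two hours moves the 02:00 boundary to midnight,
-- so the difference of the two dates counts the boundaries that were crossed.
SELECT (timestamp '2013-10-10 18:58:22' - '2:0'::time)::date
     - (timestamp '2013-10-09 23:08:17' - '2:0'::time)::date AS split_crossing   -- 2013-10-10 minus 2013-10-09 = 1
     , (timestamp '2013-10-09 23:16:00' - '2:0'::time)::date
     - (timestamp '2013-10-09 23:08:17' - '2:0'::time)::date AS split_same_day;  -- 2013-10-09 minus 2013-10-09 = 0
```

A value of 0 means the row is passed through unchanged; a value of 1 means it has to be cut at 02:00.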
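In the next subquery sub2, generate_series() produces the two rows needed for every range that has to be split.
In the outer SELECT, the CASE expressions adjust the timestamps accordingly.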
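The adaptation for ranges that cross 2 a.m. more than once could look roughly like the following. This is a sketch under the same assumptions as the query above (a table t with usage_from and usage_till, and generate_series() being usable in the SELECT list); the helper column first_day is introduced here for illustration only.

```sql
-- Split every range at each 02:00 boundary it crosses (split may be > 1).
-- Piece g starts at usage_from (g = 0) or at the g-th boundary,
-- and ends at the next boundary or at usage_till (g = split).
SELECT CASE WHEN g = 0     THEN usage_from
            ELSE first_day + g + '2:0'::time END       AS usage_from
     , CASE WHEN g = split THEN usage_till
            ELSE first_day + (g + 1) + '2:0'::time END AS usage_till
FROM  (
   SELECT *, generate_series(0, split) AS g
   FROM  (
      SELECT usage_from, usage_till
           , (usage_from - '2:0'::time)::date AS first_day   -- hypothetical helper column
           , (usage_till - '2:0'::time)::date
           - (usage_from - '2:0'::time)::date AS split
      FROM   t
      ) sub1
   ) sub2;
```

Here date + integer yields a date and date + time yields a timestamp, so each boundary is 02:00 on one of the days following first_day.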
Normally I would use interval '2 hours' instead of '2:0'::time, but I seem to remember that Redshift does not support the interval type.
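If Redshift allows generate_series() only in the FROM list and not in the SELECT list, then you are out of luck. That is already the archaic form. In modern Postgres you would use a LATERAL JOIN. You could try your luck with regexp_split_to_table(), but that is not in Postgres 8.0 either.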
¹ But the manual says generate_series() is unsupported.
Other than that, I can only think of a procedural solution with PL/pgSQL. But Redshift is probably restricted there as well ...
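If generate_series() does turn out to be unavailable, one set-based alternative that may be worth trying is a plain UNION ALL, which needs no set-returning function at all. This is a different technique from the query above and only a sketch: it assumes the same table t, at most one 2 a.m. crossing per row, and that Redshift accepts the timestamp and time arithmetic used here, which has not been tested.

```sql
-- Rows that do not cross 02:00 are passed through unchanged.
SELECT usage_from, usage_till
FROM   t
WHERE  (usage_till - '2:0'::time)::date = (usage_from - '2:0'::time)::date

UNION ALL
-- First piece of a crossing row: from the original start up to 02:00.
SELECT usage_from
     , (usage_till - '2:0'::time)::date + '2:0'::time
FROM   t
WHERE  (usage_till - '2:0'::time)::date > (usage_from - '2:0'::time)::date

UNION ALL
-- Second piece of a crossing row: from 02:00 up to the original end.
SELECT (usage_till - '2:0'::time)::date + '2:0'::time
     , usage_till
FROM   t
WHERE  (usage_till - '2:0'::time)::date > (usage_from - '2:0'::time)::date;
```

The table is scanned three times, which is the price for avoiding generate_series(); UNION ALL does not guarantee row order, so add an ORDER BY if the pieces should appear next to each other.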