Flatten a range stored in a column's cell value into rows

Time: 2019-06-26 12:32:12

Tags: sql presto amazon-athena

I have a table in which one column holds a document's page range, and I want to expand the table so that every page in that range gets its own row.

I have:

| document | type | page_range |
| -------- | ---- | ---------- |
|        1 |  A   |    1-3     |
|        2 |  B   |    4-5     |

I want:

| document | type | pages |
| -------- | ---- | ----- |
|        1 |  A   |   1   |
|        1 |  A   |   2   |
|        1 |  A   |   3   |
|        2 |  B   |   4   |
|        2 |  B   |   5   |

1 answer:

Answer 0 (score: 1)

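You can:

  • use regexp_extract to extract the range bounds
  • use sequence to turn the range bounds into a list of values
  • use CROSS JOIN UNNEST to flatten that list into rows

Like this: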

SELECT id, x
FROM (VALUES ('A', '1-3'), ('B', '4-5')) t(id, range)
CROSS JOIN UNNEST (
    sequence(
        CAST(regexp_extract(range, '(\d+)-(\d+)', 1) AS bigint),
        CAST(regexp_extract(range, '(\d+)-(\d+)', 2) AS bigint))
) s(x);

Example output:

presto> SELECT id, x
     -> FROM (VALUES ('A', '1-3'), ('B', '4-5')) t(id, range)
     -> CROSS JOIN UNNEST (
     ->     sequence(
     ->         CAST(regexp_extract(range, '(\d+)-(\d+)', 1) AS bigint),
     ->         CAST(regexp_extract(range, '(\d+)-(\d+)', 2) AS bigint))
     -> ) s(x);
 id | x
----+---
 A  | 1
 A  | 2
 A  | 3
 B  | 4
 B  | 5
(5 rows)
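
The query above uses a two-column stand-in (id, range). As a minimal sketch, the same approach could be applied to the three columns from the question. The inline VALUES list here is only a hypothetical substitute for the real table, and the aliases d, p and page are names chosen for illustration:

```sql
-- Inline rows standing in for the asker's table (hypothetical; replace
-- the VALUES subquery with the real table name).
SELECT d.document, d.type, p.page AS pages
FROM (VALUES (1, 'A', '1-3'), (2, 'B', '4-5')) d(document, type, page_range)
CROSS JOIN UNNEST (
    -- Build the list of pages from the lower and upper bound of the range.
    sequence(
        CAST(regexp_extract(d.page_range, '(\d+)-(\d+)', 1) AS bigint),
        CAST(regexp_extract(d.page_range, '(\d+)-(\d+)', 2) AS bigint))
) p(page)
ORDER BY d.document, p.page;
```

This sketch assumes every page_range value has the form start-end. For a value that does not match the pattern (a single page such as '7', for example), regexp_extract returns NULL, so such rows would need separate handling.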