SQL help: getting the required output

Time: 2015-06-20 18:07:13

Tags: sql amazon-redshift

Input:

+---------+---------+--------+
| row_min | row_max | tCount |
+---------+---------+--------+
|       2 |       4 |      1 |
|       7 |      10 |      2 |
|      13 |      14 |      3 |
+---------+---------+--------+

Required output:

+-----+--------+
| row | tcount |
+-----+--------+
|   2 |      1 |
|   3 |      1 |
|   4 |      1 |
|   7 |      2 |
|   8 |      2 |
|   9 |      2 |
|  10 |      2 |
|  13 |      3 |
|  14 |      3 |
+-----+--------+

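In the output, each row_min to row_max range is expanded into one row per value in the range, and each of those rows carries the corresponding tcount. This step is part of a data transformation that I need to do in SQL on a dataset that resides in Amazon Redshift. I am stuck on this particular step. Please provide the SQL for it, preferably using only joins and analytic functions.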

1 answer:

Answer 0: (score: 2)

You can use a tally (numbers) table that is large enough to contain every number up to the table's MAX(row_max):

WITH Tally AS (
   -- 10 x 10 cross join yields 100 rows, numbered 1..100;
   -- this must reach at least MAX(row_max) in mytable.
   SELECT ROW_NUMBER() OVER() AS n
   FROM (
      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 ) x(n)
   CROSS JOIN (
      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 ) y(n)
)
-- Keep every number that falls inside a [row_min, row_max] range.
SELECT t.n AS "row", m.tCount
FROM Tally AS t
INNER JOIN mytable AS m ON t.n >= m.row_min AND t.n <= m.row_max
ORDER BY t.n

I think Redshift supports simple, non-recursive CTEs, so the above should work.
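
The tally in the query above stops at 100, so it only works while MAX(row_max) is at or below that. Here is a minimal sketch of one way to raise the ceiling to 1000, assuming a three-way cross join of a ten-row digits CTE is acceptable. The names `digits`, `d`, `a`, `b` and `c` are made up for illustration, and the INT column types for `mytable` are an assumption, since the question does not state them.

```sql
-- Sample data from the question; INT column types are an assumption.
CREATE TABLE mytable (row_min INT, row_max INT, tCount INT);

INSERT INTO mytable (row_min, row_max, tCount) VALUES
    (2, 4, 1),
    (7, 10, 2),
    (13, 14, 3);

WITH digits AS (
    -- Ten rows holding the digits 0..9.
    SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL
    SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL
    SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
Tally AS (
    -- Units, tens and hundreds combine into 1..1000.
    SELECT a.d + 10 * b.d + 100 * c.d + 1 AS n
    FROM digits AS a
    CROSS JOIN digits AS b
    CROSS JOIN digits AS c
)
-- One output row per number inside each [row_min, row_max] range.
SELECT t.n AS "row", m.tCount
FROM Tally AS t
INNER JOIN mytable AS m
    ON t.n BETWEEN m.row_min AND m.row_max
ORDER BY t.n;
```

Computing n arithmetically means the numbering does not depend on how ROW_NUMBER() orders rows when no ORDER BY is given. With the sample data this should return the nine rows listed in the required output. If ranges in the table overlap, a number covered by two ranges will appear once per matching range.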
