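Redshift - multiple columns to rows (unpivot)

Date: 2018-10-12 14:13:22

Tags: amazon-redshift tableau unpivot

In Redshift:

I have a table with 30 dimension fields and more than 150 measure fields.
To make good use of this data in a visualization tool (Tableau), I need to unpivot the measure columns into a single measure column plus one dimension that categorizes them.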

Short example:

    Date         Country    Order     Banana  Apple  Orange  Kiwi  Lemon
    1-10-2018    Belgium    XYZ789    14       0     10      16    7
    1-10-2018    Germany    ABC123    10      15      3      15    3
    2-10-2018    Belgium    KLM456     9       9      7       1    7

Result:

    Date         Country    Order     Measure_Name   Measure_Value
    1-10-2018    Belgium    XYZ789    Banana         14
    1-10-2018    Belgium    XYZ789    Apple           0
    1-10-2018    Belgium    XYZ789    Orange         10
    1-10-2018    Belgium    XYZ789    Kiwi           16
    1-10-2018    Belgium    XYZ789    Lemon           7
    1-10-2018    Germany    ABC123    Banana         10
    1-10-2018    Germany    ABC123    Apple          15
    1-10-2018    Germany    ABC123    Orange          3
    1-10-2018    Germany    ABC123    Kiwi           15
    1-10-2018    Germany    ABC123    Lemon           3
    2-10-2018    Belgium    KLM456    Banana          9
    2-10-2018    Belgium    KLM456    Apple           9
    2-10-2018    Belgium    KLM456    Orange          7
    2-10-2018    Belgium    KLM456    Kiwi            1
    2-10-2018    Belgium    KLM456    Lemon           7

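I know about the 'UNION ALL' solution and have already tried it, but my table has millions of rows, and with more than 150 columns to unpivot it is simply too large for that approach. (Even the SQL statement itself is over 8k lines long.)

Do you have any ideas that could help me?

Thanks a lot.

2 answers:

Answer 0: (score: 1)

Given that you have 150 columns to transpose, I don't think it is feasible to do this in SQL. I had almost exactly the same situation and used Python to solve it. Pseudocode and explanations are in this question: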

Redshift. How can we transpose (dynamically) a table from columns to rows?

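Answer 1: (score: 0)

When writing this code in an "imperative" way, you would want to generate several rows out of one, possibly using something like flatMap (or the equivalent in your programming language). To generate rows in SQL, you have to use a JOIN.

This problem can be solved by (CROSS) JOINing your table with another table that has as many rows as there are columns to unpivot. You need to add some conditional magic and voila!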
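As a rough illustration of the imperative flatMap idea, here is a minimal Python sketch that turns each wide row into one row per measure. The names `rows`, `DIMENSIONS`, `MEASURES` and `unpivot_row` are hypothetical and exist only for this example; the data mirrors the sample table from the question.

```python
from itertools import chain

# Hypothetical in-memory copy of the sample table from the question.
rows = [
    {"Date": "1-10-2018", "Country": "Belgium", "Order": "XYZ789",
     "Banana": 14, "Apple": 0, "Orange": 10, "Kiwi": 16, "Lemon": 7},
    {"Date": "1-10-2018", "Country": "Germany", "Order": "ABC123",
     "Banana": 10, "Apple": 15, "Orange": 3, "Kiwi": 15, "Lemon": 3},
    {"Date": "2-10-2018", "Country": "Belgium", "Order": "KLM456",
     "Banana": 9, "Apple": 9, "Orange": 7, "Kiwi": 1, "Lemon": 7},
]

DIMENSIONS = ["Date", "Country", "Order"]                  # columns kept as-is
MEASURES = ["Banana", "Apple", "Orange", "Kiwi", "Lemon"]  # columns to unpivot


def unpivot_row(row):
    """Turn one wide row into a list with one long row per measure."""
    dims = {d: row[d] for d in DIMENSIONS}
    return [
        {**dims, "Measure_Name": m, "Measure_Value": row[m]}
        for m in MEASURES
    ]


# flatMap: map every row to a list of rows, then flatten the lists.
long_rows = list(chain.from_iterable(unpivot_row(r) for r in rows))
```

Each input row yields `len(MEASURES)` output rows, which is exactly the row multiplication that the JOIN has to reproduce in SQL.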

CREATE TABLE t (
  "Date" date, 
  "Country" varchar, 
  "Order" varchar, 
  "Banana" varchar, 
  "Apple" varchar, 
  "Orange" varchar, 
  "Kiwi" varchar, 
  "Lemon" varchar
);

INSERT INTO t VALUES ('1-10-2018', 'Belgium', 'XYZ789', '14', '0', '10', '16', '7');
INSERT INTO t VALUES ('1-10-2018', 'Germany', 'ABC123', '10', '15', '3', '15', '3');
INSERT INTO t VALUES ('2-10-2018', 'Belgium', 'KLM456', '9', '9', '7', '1', '7');

-- One row per measure column to unpivot.
WITH 
    cols as (
      select 'Banana' as c
      union all 
      select 'Apple' as c
      union all 
      select 'Orange' as c
      union all 
      select 'Kiwi' as c
      union all 
      select 'Lemon' as c
      )
select 
    "Date", 
    "Country", 
    "Order",
    c as "Measure_Name",   -- name of the column this row came from
    CASE c                 -- pick the value of that column
        WHEN 'Banana' THEN "Banana" 
        WHEN 'Apple' THEN "Apple"
        WHEN 'Orange' THEN "Orange"
        WHEN 'Kiwi' THEN "Kiwi"
        WHEN 'Lemon' THEN "Lemon"
        ELSE NULL
    END as "Measure_Value"
from t cross join cols;

https://www.db-fiddle.com/f/kojuPAjpS5twCKXSPVqYyP/2
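
With 150+ measure columns, typing the list of names and the CASE branches by hand is tedious, so one option is to generate the query from a list of column names. The sketch below is only an illustration under stated assumptions: `build_unpivot_sql` is a hypothetical helper, and an in-memory SQLite database stands in for Redshift so the generated statement can be tried locally. It has not been checked against Redshift itself.

```python
import sqlite3


def quote_ident(name):
    """Quote a column or table name as a SQL identifier."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text):
    """Quote a string as a SQL literal."""
    return "'" + text.replace("'", "''") + "'"


def build_unpivot_sql(table, dimensions, measures):
    """Build a CROSS JOIN unpivot query (hypothetical helper).

    Column names are interpolated into the SQL text, so they should come
    from a trusted source such as the table's own metadata.
    """
    cols_cte = "\n  UNION ALL ".join(
        f"SELECT {quote_literal(m)} AS c" for m in measures
    )
    whens = "\n    ".join(
        f"WHEN {quote_literal(m)} THEN {quote_ident(m)}" for m in measures
    )
    dims = ", ".join(quote_ident(d) for d in dimensions)
    return (
        f"WITH cols AS (\n  {cols_cte}\n)\n"
        f'SELECT {dims}, c AS "Measure_Name",\n'
        f'  CASE c\n    {whens}\n  END AS "Measure_Value"\n'
        f"FROM {quote_ident(table)} CROSS JOIN cols"
    )


dimensions = ["Date", "Country", "Order"]
measures = ["Banana", "Apple", "Orange", "Kiwi", "Lemon"]

# SQLite is used here purely as a local stand-in for Redshift.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE t ("Date" TEXT, "Country" TEXT, "Order" TEXT, '
    '"Banana" INT, "Apple" INT, "Orange" INT, "Kiwi" INT, "Lemon" INT)'
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("1-10-2018", "Belgium", "XYZ789", 14, 0, 10, 16, 7),
        ("1-10-2018", "Germany", "ABC123", 10, 15, 3, 15, 3),
        ("2-10-2018", "Belgium", "KLM456", 9, 9, 7, 1, 7),
    ],
)

sql = build_unpivot_sql("t", dimensions, measures)
result = conn.execute(sql).fetchall()
```

The query has no ORDER BY, so the order of `result` is not guaranteed; add one if the consumer depends on it.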