Converting tabular data to JSON in Redshift

Time: 2017-07-08 07:28:47

Tags: amazon-redshift

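I can't figure out how to convert tabular data to JSON format and store it in another table in Redshift. For example, I have a "DEMO" table with four columns: pid, stid, item_id, trans_id.

For each combination of pid, stid, item_id there are many trans_id values.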

pid, stid, item_id, trans_id

1, AB,  P1, T1
1, AB,  P1, T2
1, AB,  P1, T3
1, AB,  P1, T4
2, ABC, P2, T5
2, ABC, P2, T6
2, ABC, P2, T7
2, ABC, P2, T8

I want to store this data in another table named "SAMPLE":

pid, stid, item_id, trans_id

1, AB,  P1, {"key1":"T1", "key2":"T2", "key3":"T3", "key4":"T4"}
2, ABC, P2, {"key1":"T5", "key2":"T6", "key3":"T7", "key4":"T8"}

I can't figure out how to load the data from "DEMO" into "SAMPLE" in this JSON format using only a SQL query in Redshift. I don't want to use any intermediate files.

2 Answers:

Answer 0: (score: 1)

The LISTAGG aggregate function lets you concatenate text values within a group, which is enough to build a JSON object:

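SELECT
 pid
,stid
,item_id
-- builds {"key1":"T1","key2":"T2",...}; values are quoted so the result is valid JSON
,'{'||listagg(
    '"key'||rn::varchar||'":"'||trans_id::varchar||'"'
    ,',') within group (order by rn)
 ||'}' as trans_json
FROM (
    -- number the transactions within each (pid, stid, item_id) group
    SELECT *, row_number() over (partition by pid,stid,item_id order by trans_id) as rn
    FROM "DEMO"
) t
GROUP BY 1,2,3;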
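
To load the result straight into "SAMPLE" without an intermediate file, as the question asks, the same kind of query can feed an INSERT ... SELECT. This is only a minimal sketch: it assumes "SAMPLE" already exists with columns pid, stid, item_id and a varchar trans_id column wide enough for the JSON text. The column list, the alias rn and the subquery name numbered are assumptions, not taken from the question. Values are wrapped in double quotes so the output is valid JSON; any trans_id containing a double quote or backslash would need escaping first.

```sql
-- Assumes "SAMPLE"(pid, stid, item_id, trans_id varchar) already exists (hypothetical layout)
INSERT INTO "SAMPLE" (pid, stid, item_id, trans_id)
SELECT
    pid,
    stid,
    item_id,
    -- one "keyN":"value" pair per transaction, joined with commas
    '{' || listagg('"key' || rn::varchar || '":"' || trans_id::varchar || '"', ',')
           within group (order by rn)
    || '}'
FROM (
    -- rn restarts at 1 for every (pid, stid, item_id) group
    SELECT pid, stid, item_id, trans_id,
           row_number() over (partition by pid, stid, item_id order by trans_id) as rn
    FROM "DEMO"
) numbered
GROUP BY pid, stid, item_id;
```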

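As a side note, in this particular case an array of transaction IDs may be a better fit: you can easily request the element at a specific position without needing keyN keys: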

WITH tran_arrays as (
    SELECT
     pid
    ,stid
    ,item_id
    ,listagg(trans_id::varchar,',') within group (order by trans_id) as tran_array
    FROM "DEMO"
    GROUP BY 1,2,3
)
SELECT *
,split_part(tran_array,',',1) as first_element
FROM tran_arrays;
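
If the array should itself be stored as JSON text rather than as a comma-separated string, a possible variation is to wrap the quoted values in square brackets and read elements back with Redshift's json_extract_array_element_text function, which takes a zero-based position. This is a sketch under the same assumptions about "DEMO" as above, not part of the original answer:

```sql
WITH tran_arrays AS (
    SELECT
        pid,
        stid,
        item_id,
        -- builds a JSON array such as ["T1","T2","T3","T4"]
        '[' || listagg('"' || trans_id::varchar || '"', ',')
               within group (order by trans_id)
        || ']' AS tran_array
    FROM "DEMO"
    GROUP BY pid, stid, item_id
)
SELECT *,
       -- positions are zero-based, so 0 is the first element
       json_extract_array_element_text(tran_array, 0) AS first_element
FROM tran_arrays;
```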

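Answer 1: (score: 0)

Very similar to the existing answer, with slight differences. This example was run on an Oracle database. I had already put the work into it and felt like sharing, in case it helps someone else.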

/* Oracle Example */
WITH demo_data AS
(
  SELECT 1 AS pid, 'AB' AS stid, 'P1' AS item_id, 'T1' AS trans_id FROM dual UNION ALL
  SELECT 1 AS pid, 'AB' AS stid, 'P1' AS item_id, 'T2' AS trans_id FROM dual UNION ALL
  SELECT 1 AS pid, 'AB' AS stid, 'P1' AS item_id, 'T3' AS trans_id FROM dual UNION ALL
  SELECT 1 AS pid, 'AB' AS stid, 'P1' AS item_id, 'T4' AS trans_id FROM dual UNION ALL
  SELECT 2 AS pid, 'ABC' AS stid, 'P2' AS item_id, 'T5' AS trans_id FROM dual UNION ALL
  SELECT 2 AS pid, 'ABC' AS stid, 'P2' AS item_id, 'T6' AS trans_id FROM dual UNION ALL
  SELECT 2 AS pid, 'ABC' AS stid, 'P2' AS item_id, 'T7' AS trans_id FROM dual UNION ALL
  SELECT 2 AS pid, 'ABC' AS stid, 'P2' AS item_id, 'T8' AS trans_id FROM dual
)
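, transformData AS
(
  -- number rows within each group so the keys restart at key1
  -- (ROWNUM would number all rows globally: key1..key8)
  SELECT pid, stid, item_id, trans_id
       , ROW_NUMBER() OVER (PARTITION BY pid, stid, item_id ORDER BY trans_id) AS keyNum
  FROM demo_data
)

SELECT pid, stid, item_id
  , '{'||
    -- CHR(34) is the double-quote character; pairs are comma-separated for valid JSON
    LISTAGG(CHR(34)||'key'||keyNum||CHR(34)||':'||CHR(34)||trans_id||CHR(34), ',')
    WITHIN GROUP (ORDER BY keyNum)
    ||'}' AS trans_id

FROM transformData
GROUP BY pid, stid, item_id
;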

The output will look like this:

[Image: screenshot of the query output]
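
On Oracle versions that ship the SQL/JSON generation functions (an assumption here: 12.2 or later), the string concatenation could instead be delegated to JSON_OBJECTAGG, which takes care of quoting and escaping the values. A small sketch with a trimmed-down copy of the demo data; the CTE name numbered and the column keynum are illustrative, and the order of keys inside the generated object should not be relied on:

```sql
/* Oracle sketch, assuming 12.2 or later */
WITH demo_data AS
(
  SELECT 1 AS pid, 'AB' AS stid, 'P1' AS item_id, 'T1' AS trans_id FROM dual UNION ALL
  SELECT 1 AS pid, 'AB' AS stid, 'P1' AS item_id, 'T2' AS trans_id FROM dual UNION ALL
  SELECT 2 AS pid, 'ABC' AS stid, 'P2' AS item_id, 'T5' AS trans_id FROM dual
)
, numbered AS
(
  -- per-group counter used to form the key names key1, key2, ...
  SELECT pid, stid, item_id, trans_id
       , ROW_NUMBER() OVER (PARTITION BY pid, stid, item_id ORDER BY trans_id) AS keynum
  FROM demo_data
)
SELECT pid, stid, item_id
     -- aggregates one "keyN": value pair per row into a single JSON object
     , JSON_OBJECTAGG(KEY 'key' || TO_CHAR(keynum) VALUE trans_id) AS trans_id
FROM numbered
GROUP BY pid, stid, item_id
;
```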