BigQuery: Convert data from multiple columns into rows

Time: 2019-07-19 07:30:18

Tags: google-analytics google-bigquery

Assume the following table in BQ:

SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7

The real table may have up to around 100 columns.

I want to transform this query so that it produces the following result table:

SELECT "Desktop" AS Device, 24 AS Nr UNION ALL
SELECT "Desktop" AS Device, 9 AS Nr UNION ALL
SELECT "Desktop" AS Device, 28 AS Nr UNION ALL
SELECT "Desktop" AS Device, 7 AS Nr UNION ALL
SELECT "Desktop" AS Device, 98 AS Nr UNION ALL
SELECT "Desktop" AS Device, 77 AS Nr UNION ALL
SELECT "Desktop" AS Device, 59 AS Nr UNION ALL
SELECT "Mobile" AS Device, 8 AS Nr UNION ALL
SELECT "Mobile" AS Device, 43 AS Nr UNION ALL
SELECT "Mobile" AS Device, 75 AS Nr UNION ALL
etc.

Does anyone know how to achieve this?

2 answers:

Answer 0: (score: 1)

You can put the numeric columns into an array and use UNNEST:

WITH raw AS (
  SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
  SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
  SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
)
-- Pack each row's numeric columns into one array; UNNEST then emits one output row per element
SELECT Device, Nr
FROM raw
LEFT JOIN UNNEST([col1, col2, col3, col4, col5, col6, col7]) AS Nr
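
If the source column name is also needed with this array approach, one possible variation (a sketch, not part of the original answer) is to unnest an array of STRUCTs that pair each column's name with its value. The field names `col` and `Nr` and the alias `kv` are illustrative choices, and the sample is trimmed to two columns for brevity.

```sql
#standardSQL
WITH raw AS (
  SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2 UNION ALL
  SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2
)
SELECT Device, kv.col, kv.Nr
FROM raw
-- One STRUCT per source column: the label is a string literal, the value is the column itself
CROSS JOIN UNNEST([
  STRUCT('col1' AS col, col1 AS Nr),
  STRUCT('col2' AS col, col2 AS Nr)
]) AS kv
```

Unlike the string-based approaches, this keeps `Nr` as INT64, at the cost of listing every column by hand.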

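Answer 1: (score: 1)

Below is for BigQuery Standard SQL. Its notable feature is that it does not depend on the number or the names of the columns to be unpivoted.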

#standardSQL
WITH raw AS (
  SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
  SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
  SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
)
SELECT Device, Nr FROM raw t, 
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))), r'":([^,}]*)')) Nr 
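
To see why this works, it can help to look at the intermediate values for a single row. The sketch below uses a hypothetical three-column row without a `Device` column, so `TO_JSON_STRING(t)` can be applied to the row directly instead of going through `SELECT AS STRUCT * EXCEPT(Device)`.

```sql
#standardSQL
SELECT
  -- The whole row serialized as JSON: {"col1":24,"col2":9,"col3":28}
  TO_JSON_STRING(t) AS json,
  -- The capture group grabs everything after each '":' up to the next ',' or '}'
  REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r'":([^,}]*)') AS vals
FROM (SELECT 24 AS col1, 9 AS col2, 28 AS col3) AS t
```

Note that the extracted values are strings, so `Nr` comes back as STRING rather than INT64. The pattern also assumes plain numeric values: string columns would keep their surrounding quotes, and values containing a comma would be cut short.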
  

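Update based on the OP's comment: I completely forgot the requirement that the column names should also be returned as a separate column

#standardSQL
-- Assumes the same `raw` CTE as in the query above
SELECT Device, SPLIT(pair, ':')[OFFSET(0)] AS col, SPLIT(pair, ':')[OFFSET(1)] AS Nr 
FROM raw t, 
UNNEST(SPLIT(REGEXP_REPLACE(TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))), r'["{}]', ''))) pair  

When applied to the same sample data, the result is as follows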

Row Device  col     Nr   
1   Desktop col1    24   
2   Desktop col2    9    
3   Desktop col3    28   
4   Desktop col4    7    
5   Desktop col5    98   
6   Desktop col6    77   
7   Desktop col7    59   
8   Mobile  col1    8    
9   Mobile  col2    43   
10  Mobile  col3    75   
11  Mobile  col4    44   
12  Mobile  col5    38   
13  Mobile  col6    31   
14  Mobile  col7    46   
15  Tablet  col1    7    
16  Tablet  col2    9    
17  Tablet  col3    34   
18  Tablet  col4    86   
19  Tablet  col5    62   
20  Tablet  col6    69   
21  Tablet  col7    74
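
Because `SPLIT` returns strings, `Nr` in the result above is a STRING. If a numeric column is needed, one option is to wrap the value in `SAFE_CAST`, which yields NULL instead of an error for anything that does not parse. The following is a minimal self-contained sketch with the sample trimmed to two columns, assuming all unpivoted columns hold integers.

```sql
#standardSQL
WITH raw AS (
  SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2 UNION ALL
  SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2
)
SELECT
  Device,
  SPLIT(pair, ':')[OFFSET(0)] AS col,
  -- Convert the extracted text back to a number; NULL if it is not a valid integer
  SAFE_CAST(SPLIT(pair, ':')[OFFSET(1)] AS INT64) AS Nr
FROM raw t,
UNNEST(SPLIT(REGEXP_REPLACE(TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))), r'["{}]', ''))) pair
```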