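Splitting a string column in BigQuery

Asked: 2013-10-16 21:06:45

Tags: google-bigquery

Suppose I have a table in BigQuery with two columns. The first column holds a name, and the second is a delimited list of values of arbitrary length. For example: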

Name | Scores
-----+------------
Bob  | 10;20;20
Sue  | 14;12;19;90
Joe  | 30;15

I want to transform this into a table whose first column is the name and whose second column is a single score value, like this:

Name,Score
Bob,10
Bob,20
Bob,20
Sue,14
Sue,12
Sue,19
Sue,90
Joe,30
Joe,15

Can this be done in BigQuery alone?

3 answers:

Answer 0 (score: 9)

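Good news, everyone! BigQuery can now SPLIT()!

See "find all two word phrases that appear in more than one row in a dataset".

Original answer, written before SPLIT() was available:

There is currently no way to split() a value in BigQuery to generate multiple rows from a string, but you can use a regular expression to look for the delimiter (commas, or semicolons as in this question) and extract the first value. Then run a similar query to extract the second value, and so on. These queries can all be merged into a single query using the pattern shown in the example above (in legacy SQL, separating tables with commas in the FROM clause acts as a UNION).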
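As a rough sketch of the regex-and-union approach described above, the legacy SQL query below extracts the first four semicolon-separated values, one subquery per position, and unions them with commas. The table name [mydataset.name_scores] is hypothetical, and the cap of four positions is an assumption chosen to cover the sample data; longer lists would need one more subquery per extra position.

```sql
#legacySQL
# Hypothetical source table: replace [mydataset.name_scores] with your own.
SELECT Name, Score
FROM
  # 1st value: everything before the first semicolon
  (SELECT Name, REGEXP_EXTRACT(Scores, r'^([^;]+)') AS Score
   FROM [mydataset.name_scores]),
  # 2nd value: skip one "value;" prefix, then capture
  (SELECT Name, REGEXP_EXTRACT(Scores, r'^[^;]+;([^;]+)') AS Score
   FROM [mydataset.name_scores]),
  # 3rd value: skip two "value;" prefixes
  (SELECT Name, REGEXP_EXTRACT(Scores, r'^(?:[^;]+;){2}([^;]+)') AS Score
   FROM [mydataset.name_scores]),
  # 4th value: skip three "value;" prefixes
  (SELECT Name, REGEXP_EXTRACT(Scores, r'^(?:[^;]+;){3}([^;]+)') AS Score
   FROM [mydataset.name_scores])
# Rows with fewer values yield NULL for the missing positions; drop them.
WHERE Score IS NOT NULL
```

This only handles a fixed maximum number of values per row, which is why SPLIT() is the better choice now that it exists.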

Answer 1 (score: 9)

In case anyone is still looking for an answer:

# legacy SQL: a comma between subqueries in FROM acts as UNION ALL
select Name, split(Scores, ';') as Score
from (
      # replace the inner custom select with your source table
      select *
      from
      (select 'Bob' as Name, '10;20;20' as Scores),
      (select 'Sue' as Name, '14;12;19;90' as Scores),
      (select 'Joe' as Name, '30;15' as Scores)
);

Answer 2 (score: 2)

Trying to rewrite Elad Ben Akoune's answer in Standard SQL, the query becomes:

WITH name_score AS (
  SELECT Name, SPLIT(Scores, ';') AS Score
  FROM (
    (SELECT * FROM (SELECT 'Bob' AS Name, '10;20;20' AS Scores))
    UNION ALL
    (SELECT * FROM (SELECT 'Sue' AS Name, '14;12;19;90' AS Scores))
    UNION ALL
    (SELECT * FROM (SELECT 'Joe' AS Name, '30;15' AS Scores))
  )
)
SELECT name, score
FROM name_score
CROSS JOIN UNNEST(name_score.score) AS score;

The output is:

+------+-------+
| name | score |
+------+-------+
| Bob  | 10    |
| Bob  | 20    |
| Bob  | 20    |
| Sue  | 14    |
| Sue  | 12    |
| Sue  | 19    |
| Sue  | 90    |
| Joe  | 30    |
| Joe  | 15    |
+------+-------+
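
The same result can probably be reached more compactly by unnesting the SPLIT() result directly in the FROM clause. The sketch below is one way to write that in Standard SQL; the alias score_value is an illustrative name chosen so it does not collide with any column name, and the CAST to INT64 is an added assumption that every value is a valid integer.

```sql
#standardSQL
WITH name_scores AS (
  -- Inline sample data; replace this CTE with your source table.
  SELECT 'Bob' AS Name, '10;20;20' AS Scores UNION ALL
  SELECT 'Sue', '14;12;19;90' UNION ALL
  SELECT 'Joe', '30;15'
)
SELECT
  Name,
  -- SPLIT returns ARRAY<STRING>; cast each element if numbers are needed.
  CAST(score_value AS INT64) AS Score
FROM name_scores,
  -- The comma is a correlated cross join: one output row per array element.
  UNNEST(SPLIT(Scores, ';')) AS score_value;
```

Row order is not guaranteed without an ORDER BY. If the original position of each score matters, UNNEST(...) WITH OFFSET exposes the element index.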