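I want to split a space-delimited string into 5 parts and create a column for each part, but I am struggling to produce the desired output. Edit: I am using the standard SQL dialect.
Sample data: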
Row published_at data_string device_id
1 2016-10-26T22:53:03.209Z 70.77 3.38 61.65 7.98 73.20 3.29 63.55 nan nan nan nan 2a0025000351353337353037
...
1 of 570 rows
Desired output:
Row published_at battery temp1 humid1 temp2 humid2 temp3 humid3 device_id
1 2016-11-03T16:24:09.833Z 70.77 3.38 61.65 7.98 73.20 3.29 63.55 2a0025000351353337353037
1 of 570 rows
Attempted query 1.a:
WITH
h2a0025_2 AS (
SELECT
TIMESTAMP '2016-10-26T22:53:03.209Z' AS published_at,
'70.77 3.38 61.65 7.98 73.20 3.29 63.55 nan nan nan nan' AS data_string,
'2a0025000351353337353037' AS device_id
UNION ALL
SELECT
TIMESTAMP '2016-10-26T22:53:03.209Z',
'70.77 3.38 61.65 7.98 73.20 3.29 63.55 nan nan nan nan',
'2a0025000351353337353037' )
SELECT
published_at,
parts[OFFSET(0)] AS Battery,
parts[OFFSET(1)] AS Temp1,
parts[OFFSET(2)] AS Humid1,
parts[OFFSET(3)] AS Temp2,
parts[OFFSET(4)] AS Humid2,
parts[OFFSET(5)] AS Temp3,
parts[OFFSET(6)] AS Humid3,
device_id
FROM (
SELECT
* EXCEPT(data_string),
SPLIT(data_string, ' ') AS parts
FROM
`h2a0025_2`);
Result 1.a: 2 identical rows
Row published_at battery temp1 humid1 temp2 humid2 temp3 humid3 device_id
1 2016-11-03T16:24:09.833Z 70.77 3.38 61.65 7.98 73.20 3.29 63.55 2a0025000351353337353037
2 2016-11-03T16:24:09.833Z 70.77 3.38 61.65 7.98 73.20 3.29 63.55 2a0025000351353337353037
2 of 2 rows
Attempt 2:
SELECT
published_at,
parts[OFFSET(0)] AS Battery,
parts[OFFSET(1)] AS Temp1,
parts[OFFSET(2)] AS Humid1,
parts[OFFSET(3)] AS Temp2,
parts[OFFSET(4)] AS Humid2,
parts[OFFSET(5)] AS Temp3,
parts[OFFSET(6)] AS Humid3,
device_id
FROM (
SELECT
* EXCEPT(data_string),
SPLIT(data_string, ' ') AS parts
FROM
`myproject.mydataset.h2a0025_2`);
Result: Query Failed. Error: Array index 3 is out of bounds (overflow)
Answer 0: (score: 2)
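Here is an example to get you started. Instead of trying to work out the correct substring positions, use the SPLIT
function and then pick the offsets you need from the resulting array.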
#standardSQL
WITH YourTable AS (
SELECT
TIMESTAMP '2016-11-03T16:24:09.833Z' AS published_at,
'80.91 22.15 45.35 14.41 64.54' AS data_string
UNION ALL
SELECT
TIMESTAMP '2016-11-04T18:34:08.143Z',
'75.37 28.43 31.17 34.80 19.33'
)
SELECT
published_at,
parts[OFFSET(0)] AS Temp1,
parts[OFFSET(1)] AS Humid1,
parts[OFFSET(2)] AS Temp2,
parts[OFFSET(3)] AS Humid2
FROM (
SELECT
* EXCEPT(data_string),
SPLIT(data_string, ' ') AS parts
FROM YourTable
);
To test against your real table, use only the following part of the script:
#standardSQL
SELECT
published_at,
parts[OFFSET(0)] AS Temp1,
parts[OFFSET(1)] AS Humid1,
parts[OFFSET(2)] AS Temp2,
parts[OFFSET(3)] AS Humid2
FROM (
SELECT
* EXCEPT(data_string),
SPLIT(data_string, ' ') AS parts
FROM `yourproject.yourdataset.yourtable`
);
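The "Array index 3 is out of bounds" error in the question suggests that at least one row of the real table splits into fewer elements than the query indexes. One possible way around this, sketched below, is to use SAFE_OFFSET instead of OFFSET, which should return NULL for a missing position rather than failing the query. The table name sample_rows, the second (deliberately short) row, and the extra n_parts column are assumptions made for illustration only; the column names follow the question.

```sql
#standardSQL
-- Sketch only: sample_rows is hypothetical inline data, not the real table.
WITH sample_rows AS (
  SELECT
    TIMESTAMP '2016-10-26T22:53:03.209Z' AS published_at,
    '70.77 3.38 61.65 7.98 73.20 3.29 63.55 nan nan nan nan' AS data_string,
    '2a0025000351353337353037' AS device_id
  UNION ALL
  SELECT
    TIMESTAMP '2016-10-26T22:54:03.209Z',
    '70.77 3.38',  -- made-up short row to mimic the failing case
    '2a0025000351353337353037'
)
SELECT
  published_at,
  -- SAFE_OFFSET yields NULL when the position does not exist;
  -- SAFE_CAST yields NULL when the text is not a valid number.
  SAFE_CAST(parts[SAFE_OFFSET(0)] AS FLOAT64) AS battery,
  SAFE_CAST(parts[SAFE_OFFSET(1)] AS FLOAT64) AS temp1,
  SAFE_CAST(parts[SAFE_OFFSET(2)] AS FLOAT64) AS humid1,
  SAFE_CAST(parts[SAFE_OFFSET(3)] AS FLOAT64) AS temp2,
  SAFE_CAST(parts[SAFE_OFFSET(4)] AS FLOAT64) AS humid2,
  SAFE_CAST(parts[SAFE_OFFSET(5)] AS FLOAT64) AS temp3,
  SAFE_CAST(parts[SAFE_OFFSET(6)] AS FLOAT64) AS humid3,
  device_id,
  ARRAY_LENGTH(parts) AS n_parts  -- helps spot rows with too few values
FROM (
  SELECT
    * EXCEPT(data_string),
    SPLIT(data_string, ' ') AS parts
  FROM sample_rows
);
```

To run this against the real table, drop the WITH clause and replace sample_rows with the full table reference, as in the script above. Filtering on n_parts (for example, n_parts < 7) may help locate the rows that caused the original error.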