我有一张包含以下数据的表格。如果我们与" _"分开,试图提取第二个字段。它应该包含[数字 - 数字|数字 - 数字]。尝试使用regexp_extract,但无法获得所需的结果。
请建议如何实现这一目标。
数据:
output
D22_022-010|022-009_84233|669250 345 022-010 172.5
D22_022-010|022-009_666249|843250 22 022-009 172.5
D28I_28-04_5042|44182_250 235 022-010 11
D22_022-010|022-009_8423250 232 022-009 11
D23_23-06_NA_FW27_D23_600 22 28-04 235
D21_21-08_NA_FW14_D21_50 56 022-010 116
D23_23-06_NA_FW27_D23_90 88 022-009 116
D21_21-08_NA_FW14_D21_50 99 23-06 22
G | TR | Search : 56021 89 21-08 56
Free Sprayer_1x1(3.30) 77 23-06 88
Click Tracker (5.4) 33 23-06 99
6.1 FW18_D28o_Click 4 21-08 89
null 77
null 33
null 4
答案 0 :(得分:1)
以下是BigQuery Standard SQL
假设您的列位于ad
且value
以下,则应按照您的要求进行操作
#standardSQL
SELECT item, ROUND(IFNULL(value / ARRAY_LENGTH(items), value)) AS split_value
FROM (
SELECT value,
SPLIT(REGEXP_EXTRACT(ad, '_((?:[0-9]+-[0-9]+)(?:\\|(?:[0-9]+-[0-9]+))*)'),'|') AS items
FROM `yourProject.yourDataset.yourTable`
) LEFT JOIN UNNEST(items) AS item
您可以使用您问题中的以下虚拟数据进行测试
#standardSQL
WITH `yourTable` AS (
SELECT 'D22_022-010|022-009_84233|669250' AS ad, 345 AS value UNION ALL
SELECT 'D22_022-010|022-009_666249|843250', 22 UNION ALL
SELECT 'D28I_28-04_5042|44182_250', 235 UNION ALL
SELECT 'D22_022-010|022-009_8423250', 232 UNION ALL
SELECT 'D23_23-06_NA_FW27_D23_600', 22 UNION ALL
SELECT 'D21_21-08_NA_FW14_D21_50', 56 UNION ALL
SELECT 'D23_23-06_NA_FW27_D23_90', 88 UNION ALL
SELECT 'D21_21-08_NA_FW14_D21_50', 99 UNION ALL
SELECT 'G | TR | Search : 56021', 89 UNION ALL
SELECT 'Free Sprayer_1x1(3.30)', 77 UNION ALL
SELECT 'Click Tracker (5.4)', 33 UNION ALL
SELECT '6.1 FW18_D28o_Click', 4
)
SELECT item, ROUND(IFNULL(value / ARRAY_LENGTH(items), value)) AS split_value
FROM (
SELECT value,
SPLIT(REGEXP_EXTRACT(ad, '_((?:[0-9]+-[0-9]+)(?:\\|(?:[0-9]+-[0-9]+))*)'),'|') AS items
FROM `yourTable`
) LEFT JOIN UNNEST(items) AS item
结果是(正如您所料)
item split_value
------- -----------
022-010 173.0
022-009 173.0
022-010 11.0
022-009 11.0
28-04 235.0
022-010 116.0
022-009 116.0
23-06 22.0
21-08 56.0
23-06 88.0
21-08 99.0
null 89.0
null 77.0
null 33.0
null 4.0