Extract keywords from a string and show them in another column

Time: 2018-07-24 21:48:40

Tags: google-bigquery

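I need to write a query that extracts specific names from a string and shows them in another column. For example, given a column with these values:

Column:
Row 1: jasdhj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj
Row 2: jasnasc8918212,ahsahkdjjMina67,
Row 3: kasdhakshd,asda,asdasd,121,121,Sina878788kasas

Keywords: Amin, Mina, Sina

How can I get these keywords into another column? I would rather not insert another column into the table, but if that is the only solution, please let me know.
Any help is appreciated!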

2 answers:

Answer 0: (score: 2)

Below is for BigQuery Standard SQL

   
#standardSQL
WITH keywords AS (
  SELECT keyword
  FROM UNNEST(SPLIT('Amin,Mina,Sina')) keyword
)
SELECT str, STRING_AGG(keyword) keywords_in_str
FROM `project.dataset.table`
CROSS JOIN keywords
WHERE REGEXP_CONTAINS(str, CONCAT(r'(?i)', keyword))
GROUP BY str 

You can test this with the dummy data from your question, as below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj' str UNION ALL
  SELECT 'jasnasc8918212,ahsahkdjjMina67,' UNION ALL
  SELECT 'kasdhakshd,asda,asdasd,121,121,Sina878788kasas' 
), keywords AS (
  SELECT keyword
  FROM UNNEST(SPLIT('Amin,Mina,Sina')) keyword
)
SELECT str, STRING_AGG(keyword) keywords_in_str
FROM `project.dataset.table`
CROSS JOIN keywords
WHERE REGEXP_CONTAINS(str, CONCAT(r'(?i)', keyword))
GROUP BY str 

The result is

Row str                                                     keywords_in_str  
1   jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj    Amin,Mina    
2   jasnasc8918212,ahsahkdjjMina67,                         Mina     
3   kasdhakshd,asda,asdasd,121,121,Sina878788kasas          Sina     
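
Two properties of the query above are worth keeping in mind: the CROSS JOIN plus WHERE drops any row that contains none of the keywords, and GROUP BY str merges rows whose strings are identical. The keyword is also concatenated into a regular expression, so a keyword containing regex metacharacters would be interpreted as a pattern. A minimal sketch of a variant that avoids all three, assuming the same placeholder table and `str` column as in the answer (the third sample row is hypothetical, added only to show the no-match case):

```sql
#standardSQL
-- Sketch only: `project.dataset.table` and the column name str are the
-- answer's placeholders, not real objects.
WITH `project.dataset.table` AS (
  SELECT 'jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj' AS str UNION ALL
  SELECT 'jasnasc8918212,ahsahkdjjMina67,' UNION ALL
  SELECT 'no keyword in this row'  -- hypothetical row, not from the question
)
SELECT
  str,
  -- Correlated scalar subquery: evaluated per row, NULL when nothing matches,
  -- so rows without keywords are kept instead of being filtered out.
  (
    SELECT STRING_AGG(keyword ORDER BY keyword)
    FROM UNNEST(['Amin', 'Mina', 'Sina']) AS keyword
    -- Case-insensitive literal substring test; no regex is involved.
    WHERE STRPOS(LOWER(str), LOWER(keyword)) > 0
  ) AS keywords_in_str
FROM `project.dataset.table`
```

Because nothing is inserted or altered, this also answers the concern in the question: keywords_in_str exists only in the query result, not in the table.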

Answer 1: (score: 0)

To count the occurrences of each keyword

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj' str UNION ALL
  SELECT 'jasnasc8918212,ahsahkdjjMina67,' UNION ALL
  SELECT 'kasdhakshd,asda,asdasd,121,121,Sina878788kasas' 
)
SELECT
  str,
  ARRAY(
    SELECT AS STRUCT
      COUNTIF(LOWER(x) = "amin") amin,
      COUNTIF(LOWER(x) = "mina") mina,
      COUNTIF(LOWER(x) = "sina") sina
    FROM UNNEST(x) x
  ) keyword
FROM (
  SELECT str, REGEXP_EXTRACT_ALL(str, "(?i)(Amin|Mina|Sina)") x
  FROM `project.dataset.table`
)
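
The query above returns the counts wrapped in a one-element array of structs, and because it scans with a single alternation pattern, matches cannot overlap: in a string such as 'Amina', only 'Amin' would be counted. If plain columns are preferred and each keyword should be counted independently, one possible sketch is to run a separate pattern per keyword. The table and column names are again the answer's placeholders:

```sql
#standardSQL
-- Sketch only: counts each keyword with its own pattern, so a match for one
-- keyword never consumes characters needed by another.
WITH `project.dataset.table` AS (
  SELECT 'jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj' AS str UNION ALL
  SELECT 'jasnasc8918212,ahsahkdjjMina67,' UNION ALL
  SELECT 'kasdhakshd,asda,asdasd,121,121,Sina878788kasas'
)
SELECT
  str,
  -- REGEXP_EXTRACT_ALL returns an array of matches; its length is the count.
  ARRAY_LENGTH(REGEXP_EXTRACT_ALL(LOWER(str), r'amin')) AS amin,
  ARRAY_LENGTH(REGEXP_EXTRACT_ALL(LOWER(str), r'mina')) AS mina,
  ARRAY_LENGTH(REGEXP_EXTRACT_ALL(LOWER(str), r'sina')) AS sina
FROM `project.dataset.table`
```

The trade-off is that the keyword list is repeated once per column, so this form suits a short, fixed list like the one in the question.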