String matching across 2 columns

Asked: 2017-08-25 14:50:46

Tags: sql google-bigquery

Is there a way to compare 2 columns in BigQuery?

I tried the following:

SELECT
  T1.id,
  CASE
    WHEN REGEXP_CONTAINS(geo, countries) THEN TRUE
    ELSE FALSE
  END AS geo_match
FROM
  T1
LEFT JOIN
  T2
ON
  T1.id = T2.id

and received the following error:

No matching signature for function REGEXP_CONTAINS for argument types: STRING, ARRAY<STRING>. Supported signatures: REGEXP_CONTAINS(STRING, STRING); REGEXP_CONTAINS(BYTES, BYTES) at [4:10]

I also tried the LIKE operator. That did not work either.

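1 Answer:

Answer 0: (score: 3)

The following is for BigQuery Standard SQL.

Based on the error message, I assume geo is a STRING and countries is a repeated string (ARRAY<STRING>). REGEXP_CONTAINS only accepts two scalar values, so the array has to be unnested and each element compared individually: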

   
#standardSQL
SELECT
  T1.id, 
  (SELECT COUNT(1) FROM UNNEST(countries) AS country WHERE geo = country) > 0 AS geo_match
FROM T1 LEFT JOIN T2
ON T1.id = T2.id
ORDER BY id  

Depending on your requirements, you can use any comparison logic (LIKE, REGEXP_CONTAINS, etc.) instead of the simple

WHERE geo = country   
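
As a rough sketch of what swapping in other comparison logic might look like, the query below applies REGEXP_CONTAINS and LIKE to each unnested element. The sample rows (region codes such as 'US-CA') are hypothetical and only there to make the query self-contained. It keeps the argument order from the question: geo is the value being searched and each array element acts as the pattern.

```sql
#standardSQL
-- Hypothetical sample data, not from the original question
WITH T1 AS (
  SELECT 1 AS id, 'US-CA' AS geo UNION ALL
  SELECT 2, 'FR-75'
),
T2 AS (
  SELECT 1 AS id, ['US', 'UK'] AS countries UNION ALL
  SELECT 2, ['MX', 'CA']
)
SELECT
  T1.id,
  -- TRUE when at least one array element matches geo as a regular expression
  (SELECT COUNT(1) FROM UNNEST(countries) AS country
   WHERE REGEXP_CONTAINS(geo, country)) > 0 AS regex_match,
  -- TRUE when at least one array element occurs in geo as a substring
  (SELECT COUNT(1) FROM UNNEST(countries) AS country
   WHERE geo LIKE CONCAT('%', country, '%')) > 0 AS like_match
FROM T1 LEFT JOIN T2
ON T1.id = T2.id
ORDER BY id
```

Note that REGEXP_CONTAINS treats each element as a regular expression, so elements containing characters such as `.` or `*` would be interpreted as pattern syntax rather than literal text.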

You can test / play with this using dummy data, as shown below:

#standardSQL
WITH T1 AS (
  SELECT 1 AS id, 'US' AS geo UNION ALL
  SELECT 2, 'UK' UNION ALL
  SELECT 3, 'MX' UNION ALL
  SELECT 4, 'CA'
), 
T2 AS (
  SELECT 1 AS id, ['US', 'UK'] AS countries UNION ALL
  SELECT 2, ['MX', 'CA'] UNION ALL
  SELECT 3, ['MX', 'CA']
)
SELECT
  T1.id, 
  (SELECT COUNT(1) FROM UNNEST(countries) AS country WHERE geo = country) > 0 AS geo_match
FROM T1 LEFT JOIN T2
ON T1.id = T2.id
ORDER BY id
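
As a variation on the query above, the `COUNT(1) ... > 0` check can probably be written with EXISTS, which states the intent (does at least one element match?) more directly. This is a sketch and not part of the original answer; it reuses the same dummy data, and the results listed in the comments assume that data.

```sql
#standardSQL
WITH T1 AS (
  SELECT 1 AS id, 'US' AS geo UNION ALL
  SELECT 2, 'UK' UNION ALL
  SELECT 3, 'MX' UNION ALL
  SELECT 4, 'CA'
),
T2 AS (
  SELECT 1 AS id, ['US', 'UK'] AS countries UNION ALL
  SELECT 2, ['MX', 'CA'] UNION ALL
  SELECT 3, ['MX', 'CA']
)
SELECT
  T1.id,
  -- EXISTS is TRUE as soon as one unnested element equals geo
  EXISTS (
    SELECT 1 FROM UNNEST(countries) AS country WHERE geo = country
  ) AS geo_match
FROM T1 LEFT JOIN T2
ON T1.id = T2.id
ORDER BY id
-- Results with this dummy data:
--   id 1 -> true   ('US' is in ['US', 'UK'])
--   id 2 -> false  ('UK' is not in ['MX', 'CA'])
--   id 3 -> true   ('MX' is in ['MX', 'CA'])
--   id 4 -> false  (no row in T2, so countries is NULL and UNNEST yields no rows)
```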