I want to run the Standard SQL query below, which uses "WHERE xxx IN ('aaa','bbb')".
However, it fails with an error. Can 'IN' be used in Standard SQL?
Standard SQL
SELECT commit, author.name, committer.name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
WHERE repo_name IN
('tensorflow/tensorflow', 'facebook/react')
Error
No matching signature for operator IN for argument types ARRAY<STRING> and {STRING} at [5:17]
The legacy SQL version below does seem to run.
Legacy SQL
SELECT commit, author.name, committer.name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM [bigquery-public-data:github_repos.commits]
WHERE repo_name IN
('tensorflow/tensorflow', 'facebook/react')
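Answer 0: (score: 3)
You can't really do this directly in Standard SQL, because repo_name is an array and IN cannot compare an array against a list of strings. The article "How to use the UNNEST function in BigQuery to analyze event parameters in Analytics" explains this well.
I think this is how you get the result you want: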
SELECT commit, author.name as author_name, committer.name as committer_name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
CROSS JOIN UNNEST(repo_name) as repo_names_unnested
WHERE repo_names_unnested IN ('tensorflow/tensorflow', 'facebook/react')
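Note that you can't select both author.name and committer.name as they are, because both would come out as a column called name. That is why I changed them to author_name and committer_name.
I also think what you are actually looking for is the result with repo_name replaced by repo_names_unnested, so try replacing it in the SELECT clause as well.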
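As a rough sketch of that last suggestion, the query could look like the following. It only reuses columns already shown above; the output alias repo_name for the unnested value is my own choice, not something the table requires. Each commit is returned once per matching repository name, so a commit that belongs to both repositories appears twice.

```sql
#standardSQL
-- Sketch: output the unnested repository name instead of the whole array.
-- The alias "repo_name" on the unnested value is an illustrative choice.
SELECT
  commit,
  author.name AS author_name,
  committer.name AS committer_name,
  subject,
  message,
  repo_names_unnested AS repo_name  -- one STRING per output row
FROM `bigquery-public-data.github_repos.commits`
CROSS JOIN UNNEST(repo_name) AS repo_names_unnested
WHERE repo_names_unnested IN ('tensorflow/tensorflow', 'facebook/react')
```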
Answer 1: (score: 2)
Below is for BigQuery Standard SQL.
If you need to keep the original records and output only those that have 'tensorflow/tensorflow' or 'facebook/react' in repo_name, use:
#standardSQL
SELECT commit, author.name AS author_name, committer.name AS committer_name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
WHERE EXISTS (
SELECT 1 FROM UNNEST(repo_name) name WHERE name IN ('tensorflow/tensorflow', 'facebook/react')
)
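Answer 2: (score: 0)
The field "repo_name" is REPEATED (legacy SQL) or an ARRAY (Standard SQL), and in Standard SQL the IN operator does not accept an array on its left-hand side. To convert an array into a set of rows, you can use the UNNEST operator.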
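To see the behaviour without scanning the public table, here is a minimal sketch on inline data. The sample_commits table and its two rows are made up purely for illustration; only the shape (a STRING column plus an ARRAY<STRING> column named repo_name) mirrors the real table.

```sql
#standardSQL
-- Hypothetical inline data standing in for the commits table.
WITH sample_commits AS (
  SELECT 'c1' AS commit, ['tensorflow/tensorflow', 'someone/fork'] AS repo_name
  UNION ALL
  SELECT 'c2', ['other/project']
)
SELECT commit, name AS matched_repo
FROM sample_commits, UNNEST(repo_name) AS name  -- one row per array element
WHERE name IN ('tensorflow/tensorflow', 'facebook/react')
```

Here the comparison is STRING against STRING, so IN works as usual. Writing WHERE repo_name IN (...) against the same data would compare ARRAY<STRING> with STRING, which is the type mismatch reported in the question.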