I have this BigQuery query:
SELECT
JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.clone_url") AS clone_url,
JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.language") AS language,
INTEGER(JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.stargazers_count")) AS stars
FROM [githubarchive:day.20200115]
WHERE JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.language")="C"
GROUP BY language, clone_url, stars
ORDER BY stars DESC
LIMIT 1000;
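It returns several rows for the same clone_url, one for each distinct stars value.
How can I show each clone_url only once, with its highest star count?
Can this query be optimized?
Here is the query result:
Thanks.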
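Answer 0 (score: 2)
You appear to still be using BigQuery Legacy SQL. To get one row per clone_url, take MAX() of the star count and drop stars from the GROUP BY. Here is the Legacy SQL version: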
#legacySQL
SELECT
JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.clone_url") AS clone_url,
JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.language") AS language,
MAX(INTEGER(JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.stargazers_count"))) AS stars
FROM [githubarchive:day.20200115]
WHERE JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.language")="C"
GROUP BY language, clone_url
ORDER BY stars DESC
LIMIT 1000
Note: migrating to BigQuery Standard SQL is strongly recommended. In Standard SQL, the same query looks like this:
#standardSQL
SELECT
JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.clone_url") AS clone_url,
JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.language") AS language,
MAX(CAST(JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.stargazers_count") AS INT64)) AS stars
FROM `githubarchive.day.20200115`
WHERE JSON_EXTRACT_SCALAR(payload, "$.pull_request.base.repo.language")="C"
GROUP BY language, clone_url
ORDER BY stars DESC
LIMIT 1000
Both queries above return one row per clone_url, with its highest star count.
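To see why moving stars out of the GROUP BY and into MAX() removes the duplicates, here is a small local sketch. It does not touch BigQuery: it uses Python's built-in sqlite3 module with a hypothetical events table and made-up rows (the example.com URLs and star counts are assumptions for illustration only), so the SQL dialect differs slightly from the queries above.

```python
import sqlite3

# In-memory table standing in for the extracted payload fields (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (clone_url TEXT, language TEXT, stars INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("https://example.com/a.git", "C", 100),  # same repo seen with 100 stars
        ("https://example.com/a.git", "C", 101),  # ... and later with 101 stars
        ("https://example.com/a.git", "C", 101),
        ("https://example.com/b.git", "C", 50),
    ],
)

# Grouping by stars as well: one row per (clone_url, stars) pair, so repo "a" repeats.
with_stars_in_group = conn.execute(
    "SELECT clone_url, language, stars FROM events "
    "GROUP BY language, clone_url, stars ORDER BY stars DESC"
).fetchall()

# Aggregating stars with MAX and grouping only by repo: one row per clone_url.
max_per_repo = conn.execute(
    "SELECT clone_url, language, MAX(stars) AS max_stars FROM events "
    "GROUP BY language, clone_url ORDER BY max_stars DESC"
).fetchall()

print(len(with_stars_in_group), "rows when stars is in GROUP BY")
print(len(max_per_repo), "rows when stars is aggregated with MAX")
```

The star count of a repository changes between events within a day, which is why grouping on it splits one repository into several rows.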