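I am currently trying to use GitHub Archive and BigQuery to get the top 100 Java repositories that have the most stars and fewer than 100 commits. Could you help me come up with a query that returns the top 100 repositories with the most stars?
The final query I arrived at is: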
SELECT repository_name
FROM [githubarchive:github.timeline]
WHERE repository_language = 'Java'
AND PARSE_UTC_USEC(repository_created_at) BETWEEN PARSE_UTC_USEC('1996-01-01 00:00:00') AND PARSE_UTC_USEC('2015-05-30 00:00:00')
GROUP BY repository_name
HAVING COUNT(*) < 100
ORDER BY COUNT(*) DESC
LIMIT 100
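Answer 0 (score: 3)
I think this query will work for you. Your existing query will not run because the ORDER BY clause references an aggregate computation, while ORDER BY requires its expressions to reference fields. Moving the COUNT into the SELECT clause and ordering by its alias fixes that part.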
Also, if you are looking for a count of git commits, you should check that each timeline event is actually a commit by adding AND payload_commit IS NOT NULL to the WHERE clause!
SELECT
  repository_name,
  COUNT(1) AS CommitCount
FROM
  [githubarchive:github.timeline]
WHERE
  repository_language = 'Java'
  AND PARSE_UTC_USEC(repository_created_at)
    BETWEEN PARSE_UTC_USEC('1996-01-01 00:00:00')
    AND PARSE_UTC_USEC('2015-05-30 00:00:00')
  AND payload_commit IS NOT NULL
GROUP BY
  repository_name
HAVING
  CommitCount < 100
ORDER BY
  CommitCount DESC
LIMIT
  100
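
If you want to sanity-check the shape of the query above without running it against BigQuery, here is a minimal local sketch. It assumes SQLite (via Python's sqlite3 module) as a stand-in, so it is not BigQuery legacy SQL. The `timeline` table and its rows are made-up toy data, and the date filter is left out for brevity. It only illustrates the pattern of aliasing the aggregate in SELECT and reusing the alias in HAVING and ORDER BY.

```python
import sqlite3

# Toy stand-in for the GitHub Archive timeline table (hypothetical data,
# not the real schema beyond the three columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timeline ("
    "repository_name TEXT, repository_language TEXT, payload_commit TEXT)"
)
rows = (
    [("small-repo", "Java", "abc")] * 3      # 3 commit events
    + [("tiny-repo", "Java", "def")] * 1     # 1 commit event
    + [("tiny-repo", "Java", None)] * 5      # non-commit events, dropped by WHERE
    + [("big-repo", "Java", "ghi")] * 150    # 150 commit events, dropped by HAVING
    + [("py-repo", "Python", "jkl")] * 2     # wrong language, dropped by WHERE
)
conn.executemany("INSERT INTO timeline VALUES (?, ?, ?)", rows)

# The aggregate is aliased in SELECT, then the alias is reused in
# HAVING and ORDER BY, mirroring the structure of the answer's query.
query = """
SELECT repository_name, COUNT(1) AS CommitCount
FROM timeline
WHERE repository_language = 'Java'
  AND payload_commit IS NOT NULL
GROUP BY repository_name
HAVING CommitCount < 100
ORDER BY CommitCount DESC
LIMIT 100
"""
result = conn.execute(query).fetchall()
print(result)  # [('small-repo', 3), ('tiny-repo', 1)]
```

Only the Java repositories with fewer than 100 commit events survive, ordered by commit count in descending order.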