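I am trying to compute the timestamp difference between consecutive rows in Google BigQuery. Attached is the sample table I have been using to test my code.
I am using this code: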
SELECT
A.row,
A.issue.updated_at,
(B.issue.updated_at - A.issue.updated_at) AS timedifference
FROM [icxmedia-servers:icx_metrics.gh_zh_data_production] A
INNER JOIN [icxmedia-servers:icx_metrics.gh_zh_data_production] B
ON B.row = (A.row + 1)
WHERE issue.number==6 and issue.name=="archer"
ORDER BY A.requestid ASC
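Answer 0 (score: 2)
Rather than a JOIN, this is more naturally expressed with an analytic function. The documentation on analytic functions with standard SQL in BigQuery explains how analytic functions work and what their syntax looks like. For example, if you want to take consecutive differences of the x values, with the order determined by column y, you can do the following: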
WITH T AS (
SELECT
x,
y
FROM UNNEST([9, 3, 4, 7]) AS x WITH OFFSET y)
SELECT
x,
x - LAG(x) OVER (ORDER BY y) AS x_diff
FROM T;
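Note that to run this in BigQuery, you need to enable standard SQL by unchecking the "Use Legacy SQL" box under "Show Options". The WITH T clause only sets up some data for the example.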
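The JOIN in the question pairs each row with the one after it (B.row = A.row + 1), which corresponds to LEAD rather than LAG. As a minimal sketch on the same illustrative data, the query below shows both directions side by side; the column names diff_from_previous and diff_to_next are made up for this example.

```sql
-- Illustrative data only: x values with their array positions y.
WITH T AS (
  SELECT x, y
  FROM UNNEST([9, 3, 4, 7]) AS x WITH OFFSET y
)
SELECT
  x,
  -- Current value minus the previous one; NULL on the first row.
  x - LAG(x) OVER (ORDER BY y) AS diff_from_previous,
  -- Next value minus the current one; NULL on the last row.
  LEAD(x) OVER (ORDER BY y) - x AS diff_to_next
FROM T
ORDER BY y;
-- Rows: (9, NULL, -6), (3, -6, 1), (4, 1, 3), (7, 3, NULL)
```

Both columns contain the same differences, shifted by one row, so which function to use depends on which row the difference should be reported on.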
For your specific case, you would probably want a query such as:
SELECT
row,
issue.updated_at,
issue.updated_at - LAG(issue.updated_at) OVER (ORDER BY issue.updated_at) AS timedifference
FROM `icxmedia-servers.icx_metrics.gh_zh_data_production`
WHERE issue.number = 6
AND issue.name = "archer"
ORDER BY requestid ASC;
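The question does not state the type of issue.updated_at. Assuming it is a TIMESTAMP, one way to get the difference in an explicit unit is TIMESTAMP_DIFF. The sketch below uses made-up inline data (the events table, row_id, and the timestamp values are hypothetical) so that it can be run on its own:

```sql
-- Hypothetical sample data standing in for the real table.
WITH events AS (
  SELECT 1 AS row_id, TIMESTAMP '2017-01-01 10:00:00+00' AS updated_at UNION ALL
  SELECT 2, TIMESTAMP '2017-01-01 10:05:30+00' UNION ALL
  SELECT 3, TIMESTAMP '2017-01-01 11:00:00+00'
)
SELECT
  row_id,
  updated_at,
  -- Seconds elapsed since the previous row; NULL for the earliest row.
  TIMESTAMP_DIFF(
    updated_at,
    LAG(updated_at) OVER (ORDER BY updated_at),
    SECOND) AS seconds_since_previous
FROM events
ORDER BY updated_at;
-- seconds_since_previous: NULL, 330, 3270
```

If the column is stored as a string or an integer instead, it would first need to be converted to a TIMESTAMP for this to apply.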
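If you want to compute the differences in updated_at for more than a single issue number, with each issue number handled separately, you can also use a PARTITION BY clause. For example: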
SELECT
row,
issue.name,
issue.number,
issue.updated_at,
issue.updated_at - LAG(issue.updated_at) OVER (
PARTITION BY issue.number
ORDER BY issue.updated_at) AS timedifference
FROM `icxmedia-servers.icx_metrics.gh_zh_data_production`
ORDER BY requestid ASC;
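To see what PARTITION BY changes, here is a small self-contained sketch. The issues table and its values are invented for illustration, and updated_at is modeled as a plain integer so that the subtraction is unambiguous:

```sql
-- Hypothetical data: two issue numbers with interleaved update times.
WITH issues AS (
  SELECT 6 AS number, 10 AS updated_at UNION ALL
  SELECT 6, 25 UNION ALL
  SELECT 7, 12 UNION ALL
  SELECT 7, 40 UNION ALL
  SELECT 6, 31
)
SELECT
  number,
  updated_at,
  -- LAG restarts within each issue number, so the first row of every
  -- partition gets NULL instead of a difference across issues.
  updated_at - LAG(updated_at) OVER (
    PARTITION BY number
    ORDER BY updated_at) AS timedifference
FROM issues
ORDER BY number, updated_at;
-- Rows: (6, 10, NULL), (6, 25, 15), (6, 31, 6), (7, 12, NULL), (7, 40, 28)
```

Without the PARTITION BY, the row (7, 12) would be compared against (6, 10), mixing updates from different issues.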