Say I have this table:
+-------+---------+-----------+
| eta   | arrived | time_diff |
+-------+---------+-----------+
| 06:47 | 06:47   |         0 |
| 08:30 | 08:40   |        10 |
| 10:30 | 10:40   |        10 |
| 10:30 | 10:31   |         1 |
+-------+---------+-----------+
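and I got time_diff with TIME_DIFF(arrived, eta, MINUTE) AS time_diff.
What I want to do is count how many 0s, 10s, and so on I have. Ideally, the table above would yield one 0, two 10s, and one 1. Of course, I don't know the time_diff values in advance; I just want to count how many times each value occurs (there could also be 2, 3, 5, ...). How can I do this in BigQuery Standard SQL?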
Answer 0 (score: 1)
You should use a GROUP BY clause:
SELECT time_diff, COUNT(*) AS cnt
FROM `project.dataset.table`  -- placeholder; replace with your own table
GROUP BY time_diff
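Since time_diff in the question is a computed expression and not a stored column, here is a minimal sketch of doing the computation and the count in a single query. It assumes eta and arrived are TIME columns; the `sample` CTE is a hypothetical stand-in that only reproduces the four rows from the question.

```sql
#standardSQL
WITH sample AS (
  -- Hypothetical stand-in for the real table, assuming TIME columns
  SELECT TIME '06:47:00' AS eta, TIME '06:47:00' AS arrived UNION ALL
  SELECT TIME '08:30:00', TIME '08:40:00' UNION ALL
  SELECT TIME '10:30:00', TIME '10:40:00' UNION ALL
  SELECT TIME '10:30:00', TIME '10:31:00'
)
SELECT
  -- Minutes between scheduled and actual arrival
  TIME_DIFF(arrived, eta, MINUTE) AS time_diff,
  -- Number of rows that share this difference
  COUNT(*) AS cnt
FROM sample
GROUP BY time_diff  -- BigQuery lets GROUP BY reference the SELECT alias
ORDER BY time_diff
```

On the four sample rows this should give one row per distinct difference: 0 with a count of 1, 1 with a count of 1, and 10 with a count of 2.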
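Answer 1 (score: 1)
The following is for BigQuery Standard SQL.
From a practical standpoint, I would recommend grouping into bins, as in the example below: 0-9, 10-19, 20-29, and so on.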
#standardSQL
WITH `project.dataset.table` AS (
  SELECT '06:47' eta, '06:47' arrived UNION ALL
  SELECT '08:30', '08:40' UNION ALL
  SELECT '10:30', '10:40' UNION ALL
  SELECT '10:30', '10:31'
)
SELECT FORMAT('%i - %i', bin_start, bin_start + 9) AS bin, cnt
FROM (
  SELECT
    -- Lower bound of the 10-minute bin that the difference falls into
    10 * DIV(TIME_DIFF(PARSE_TIME('%R', arrived), PARSE_TIME('%R', eta), MINUTE), 10) AS bin_start,
    COUNT(1) AS cnt
  FROM `project.dataset.table`
  GROUP BY bin_start
)
-- Order by the numeric lower bound so that '100 - 109' does not sort before '20 - 29'
ORDER BY bin_start
with result
Row  bin      cnt
1    0 - 9    2
2    10 - 19  2
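One caveat with DIV-based binning: DIV truncates toward zero, so if a vehicle can arrive early (a negative difference), values from -9 to 9 would all land in the same bin. The question does not say whether early arrivals occur, so the following is only a sketch under that assumption. It uses FLOOR instead of DIV; the `diffs` CTE is hypothetical, and its -5 value is invented purely to illustrate the negative case.

```sql
#standardSQL
WITH diffs AS (
  -- 0, 10, 10, 1 are the differences from the question;
  -- -5 is a hypothetical early arrival added for illustration
  SELECT diff FROM UNNEST([0, 10, 10, 1, -5]) AS diff
)
SELECT
  -- FLOOR rounds toward negative infinity, so -5 falls into the bin starting at -10
  10 * CAST(FLOOR(diff / 10) AS INT64) AS bin_start,
  COUNT(*) AS cnt
FROM diffs
GROUP BY bin_start
ORDER BY bin_start
```

With DIV, the -5 row would be counted in the bin starting at 0; with FLOOR it gets its own bin starting at -10, while the non-negative values are binned the same way as before.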
If you need the exact distribution for each time_diff, use the query below:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT '06:47' eta, '06:47' arrived UNION ALL
  SELECT '08:30', '08:40' UNION ALL
  SELECT '10:30', '10:40' UNION ALL
  SELECT '10:30', '10:31'
)
SELECT
  TIME_DIFF(PARSE_TIME('%R', arrived), PARSE_TIME('%R', eta), MINUTE) AS diff,
  COUNT(1) AS cnt
FROM `project.dataset.table`
GROUP BY diff
ORDER BY diff
with result
Row  diff  cnt
1    0     1
2    1     1
3    10    2