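I have a table named Employees with the columns PersonID, Name, and StartDate. I want to calculate 1) the difference in days between the newest and the oldest employee, and 2) the longest period (in days) without any new hires. I tried using DATEDIFF, but the dates are all in a single column and I'm not sure what other approach to use. Any help would be greatly appreciated.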
Answer 0 (score: 1)
Below is for BigQuery Standard SQL
#standardSQL
SELECT
  SUM(days_before_next_hire) AS days_between_newest_and_oldest_employee,
  MAX(days_before_next_hire) - 1 AS longest_period_without_new_hire
FROM (
  SELECT
    DATE_DIFF(
      StartDate,
      LAG(StartDate) OVER(ORDER BY StartDate),
      DAY
    ) days_before_next_hire
  FROM `project.dataset.your_table`
)
You can test and play with the above using dummy data, as in the example below
#standardSQL
WITH `project.dataset.your_table` AS (
  SELECT DATE '2019-01-01' StartDate UNION ALL
  SELECT '2019-01-03' StartDate UNION ALL
  SELECT '2019-01-13' StartDate
)
SELECT
  SUM(days_before_next_hire) AS days_between_newest_and_oldest_employee,
  MAX(days_before_next_hire) - 1 AS longest_period_without_new_hire
FROM (
  SELECT
    DATE_DIFF(
      StartDate,
      LAG(StartDate) OVER(ORDER BY StartDate),
      DAY
    ) days_before_next_hire
  FROM `project.dataset.your_table`
)
with result
Row  days_between_newest_and_oldest_employee  longest_period_without_new_hire
1    12                                       9
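Note the use of -1 when calculating longest_period_without_new_hire. Whether you want this adjustment really depends on how you prefer to count gaps: without it you get the difference between two consecutive hire dates, with it you get the number of full days strictly between them.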
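To make the -1 adjustment concrete, here is a small cross-check of the sample result in plain Python. It is only a sketch of the same arithmetic, not something BigQuery runs; the variable names are made up for illustration, and the dates are the three dummy rows from the query above.

```python
from datetime import date

# The three StartDate values from the dummy-data query above.
start_dates = [date(2019, 1, 1), date(2019, 1, 3), date(2019, 1, 13)]

ordered = sorted(start_dates)

# Mirrors DATE_DIFF(StartDate, LAG(StartDate) OVER (ORDER BY StartDate), DAY).
# The first row has no predecessor (LAG gives NULL), so it contributes no gap.
gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

total_span = sum(gaps)                # days between oldest and newest hire
longest_gap = max(gaps)               # largest difference between consecutive hires
days_without_hire = longest_gap - 1   # full days strictly between those two hires

print(gaps, total_span, longest_gap, days_without_hire)  # [2, 10] 12 10 9
```

Between 2019-01-03 and 2019-01-13 the date difference is 10, but only 9 calendar days (the 4th through the 12th) have no hire on them, which is where the 9 in the result comes from. Drop the -1 if you would rather report 10.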
Answer 1 (score: 0)
1) Difference in days between the newest and the oldest record
WITH table AS (
  SELECT DATE(created_at) date, *
  FROM `githubarchive.day.201901*`
  WHERE _table_suffix < '2'
    AND repo.name = 'google/bazel-common'
    AND type = 'ForkEvent'
)
SELECT DATE_DIFF(MAX(date), MIN(date), DAY) max_minus_min
FROM table
2) Longest period (in days) without any new records
WITH table AS (
  SELECT DATE(created_at) date, *
  FROM `githubarchive.day.201901*`
  WHERE _table_suffix < '2'
    AND repo.name = 'google/bazel-common'
    AND type = 'ForkEvent'
)
SELECT MAX(diff) max_diff
FROM (
  SELECT DATE_DIFF(date, LAG(date) OVER(ORDER BY date), DAY) diff
  FROM table
)
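The two answers compute the first number differently: one sums the gaps between consecutive dates, the other subtracts MIN from MAX. These should always agree, because the consecutive differences telescope. A quick sanity check in plain Python, using hypothetical dates (unsorted, with one duplicate) that do not come from either answer:

```python
from datetime import date

# Hypothetical hire dates, deliberately unsorted and containing a duplicate.
hire_dates = [
    date(2019, 3, 10),
    date(2019, 1, 5),
    date(2019, 1, 5),
    date(2019, 2, 1),
]

ordered = sorted(hire_dates)

# Gap between each date and the previous one, as LAG(...) OVER (ORDER BY ...) would give.
gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]

span_from_gaps = sum(gaps)                                        # SUM of the gaps
span_from_extremes = (max(hire_dates) - min(hire_dates)).days     # MAX minus MIN

print(span_from_gaps == span_from_extremes)
```

Note that for the second number this answer returns MAX(diff) as is, without the -1 adjustment used in the other answer, so the two can differ by one day depending on how you count.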