BigQuery - Omit the record with the minimum value

Asked: 2017-04-28 21:45:47

Tags: google-bigquery

I want to remove the record that has the minimum value in a field. Here is a sample of my data:

ID       EVENT            CREATED_TIME  
1        login_event      2017-03-13 13:44:21 UTC    
2        login_event      2017-03-13 13:45:46 UTC    
3        login_event      2017-03-16 15:08:24 UTC    
4        login_event      2017-04-21 20:08:44 UTC    
5        login_event      2017-03-16 15:08:59 UTC    
6        login_event      2017-04-21 20:09:25 UTC    
7        login_event      2017-04-21 20:11:46 UTC

I want to run a query that removes the record with the earliest CREATED_TIME. Here is the desired output:

ID       EVENT            CREATED_TIME  
2        login_event      2017-03-13 13:45:46 UTC    
3        login_event      2017-03-16 15:08:24 UTC    
4        login_event      2017-04-21 20:08:44 UTC    
5        login_event      2017-03-16 15:08:59 UTC    
6        login_event      2017-04-21 20:09:25 UTC    
7        login_event      2017-04-21 20:11:46 UTC

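I found the OMIT RECORD IF clause in the BigQuery documentation, but I couldn't get it to work. I know I can do this with a combination of RANK, PARTITION, and WHERE rank != 1. However, I feel there should be a more intuitive way to achieve this (for example, via OMIT RECORD IF).

Thanks!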

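1 Answer:

Answer 0 (score: 1)

The following is for BigQuery Standard SQL. It flags each row whose CREATED_TIME equals the minimum within its EVENT, then filters the flagged rows out: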

  
#standardSQL
SELECT * EXCEPT(start) 
FROM (
  SELECT *, 
    CREATED_TIME = MIN(CREATED_TIME) OVER(PARTITION BY EVENT) AS start
  FROM yourTable
)
WHERE NOT start
-- ORDER BY CREATED_TIME

You can test it with the dummy data from your question:

#standardSQL
WITH yourTable AS (
  SELECT 1 AS ID, 'login_event' AS EVENT, TIMESTAMP '2017-03-13 13:44:21 UTC' AS CREATED_TIME UNION ALL    
  SELECT 2, 'login_event', TIMESTAMP '2017-03-13 13:45:46 UTC' UNION ALL    
  SELECT 3, 'login_event', TIMESTAMP '2017-03-16 15:08:24 UTC' UNION ALL    
  SELECT 4, 'login_event', TIMESTAMP '2017-04-21 20:08:44 UTC' UNION ALL    
  SELECT 5, 'login_event', TIMESTAMP '2017-03-16 15:08:59 UTC' UNION ALL    
  SELECT 6, 'login_event', TIMESTAMP '2017-04-21 20:09:25 UTC' UNION ALL    
  SELECT 7, 'login_event', TIMESTAMP '2017-04-21 20:11:46 UTC' 
)
SELECT * EXCEPT(start) 
FROM (
  SELECT *, 
    CREATED_TIME = MIN(CREATED_TIME) OVER(PARTITION BY EVENT) AS start
  FROM yourTable
)
WHERE NOT start
ORDER BY CREATED_TIME
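
For comparison, the RANK-based approach mentioned in the question could be sketched roughly as below. This is only an illustration, not part of the original answer: `yourTable` is the same placeholder table name used above, the alias `rnk` is made up for this example, and it assumes CREATED_TIME is never NULL (NULLs sort first in ascending order, so they would take rank 1 here and the true earliest row would be kept).

```sql
#standardSQL
-- Sketch of the RANK-based alternative; `yourTable` and `rnk` are placeholders.
SELECT * EXCEPT(rnk)
FROM (
  SELECT *,
    -- rnk = 1 marks the earliest CREATED_TIME within each EVENT;
    -- rows tied for the earliest time all share rank 1
    RANK() OVER(PARTITION BY EVENT ORDER BY CREATED_TIME) AS rnk
  FROM yourTable
)
WHERE rnk != 1
ORDER BY CREATED_TIME
```

Under that no-NULL assumption, this should drop the same rows as the MIN-based query, including ties for the earliest timestamp. The MIN version avoids the ORDER BY inside the window, which is probably why it reads as the more direct of the two.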