BigQuery - rows between clauses?

Time: 2017-02-09 03:45:25

Tags: sql google-bigquery

I haven't found a solution for this in BigQuery yet. Say I have a table (call it TripSpeed):

DeviceId | TripId | Speed | DateTime 
  5           1      0                 
  5           1      8                 
  5           1      12                
  5           1       0                               
  5           1       2                
  5           2       ..................
  5           2       .................
  6           ..........................

I want to reorganize (aggregate) it into:

DeviceId | TripId | Speed | DateTime
  5           1      0,8,12             
  5           1      0,2                        
  5           2       ....................
  5           2       ...................
  6           ............................

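More details:

  1. The data is grouped by DeviceId and TripId.

  2. DateTime is unique for every row, precise to the millisecond, and the data must be ordered by DateTime within each group.

  3. Within a group, a row with Speed = 0 starts a new segment.
  4. I have already done other cleanup, so there are no consecutive zeros.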

2 answers:

Answer 0: (score: 1)

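You need to assign a group to each row and then aggregate. The assignment is simple: it is the cumulative count of zeros. The rest is aggregation. However, this assumes you have a column that specifies the order of the rows. I assume that is datetime:

select deviceid, tripid, group_concat(speed)
from (select t.*,
             sum(case when speed = 0 then 1 else 0 end) over (partition by deviceid, tripid order by datetime) as grp
      from t
     ) t
group by deviceid, tripid, grp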
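The running-count idea can also be sketched outside SQL. The following is a minimal Python illustration, not BigQuery code: the `rows` list and the `segment_speeds` helper are hypothetical names made up for this example, and the rows are assumed to be `(device_id, trip_id, speed, date_time)` tuples.

```python
from itertools import accumulate, groupby

# Hypothetical sample rows: (device_id, trip_id, speed, date_time).
rows = [
    (5, 1, 0, 1),
    (5, 1, 8, 2),
    (5, 1, 12, 3),
    (5, 1, 0, 4),
    (5, 1, 2, 5),
    (5, 2, 0, 6),
    (5, 2, 1, 7),
    (6, 3, 0, 8),
]


def segment_speeds(rows):
    """Return (device_id, trip_id, grp, 'speed,speed,...') tuples."""
    result = []
    # Sort by device, trip, then time so each trip is contiguous and ordered.
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[3]))
    for (device_id, trip_id), trip in groupby(ordered, key=lambda r: (r[0], r[1])):
        trip = list(trip)
        # Running count of zero speeds within the trip: the analogue of
        # sum(case when speed = 0 then 1 else 0 end) over (... order by datetime).
        grps = list(accumulate(1 if r[2] == 0 else 0 for r in trip))
        # Rows that share a running count belong to the same segment.
        for grp, members in groupby(zip(grps, trip), key=lambda pair: pair[0]):
            speeds = ",".join(str(row[2]) for _, row in members)
            result.append((device_id, trip_id, grp, speeds))
    return result


segments = segment_speeds(rows)
for segment in segments:
    print(segment)
```

Each zero bumps the running count by one, so every row up to the next zero carries the same `grp` value, and grouping on that value yields one output row per segment.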

Answer 1: (score: 1)

For BigQuery Standard SQL:

  
#standardSQL
WITH TripSpeed AS (
  SELECT 5 AS DeviceId, 1 AS TripId, 0 AS Speed, 1 AS DateTime UNION ALL                 
  SELECT 5, 1, 8, 2 UNION ALL                 
  SELECT 5, 1, 12, 3 UNION ALL                
  SELECT 5, 1, 0, 4 UNION ALL                               
  SELECT 5, 1, 2, 5 UNION ALL                
  SELECT 5, 2, 0, 6 UNION ALL
  SELECT 5, 2, 1, 7 UNION ALL
  SELECT 6, 3, 0, 8 
)
SELECT DeviceId, TripId,
  -- ORDER BY inside STRING_AGG keeps the speeds in DateTime order
  STRING_AGG(CAST(Speed AS STRING) ORDER BY DateTime) AS Speed, Segment
FROM (
  SELECT DeviceId, TripId, Speed, DateTime,
    COUNTIF(Speed = 0) OVER (PARTITION BY DeviceId, TripId ORDER BY DateTime) AS Segment
  FROM TripSpeed 
) 
GROUP BY DeviceId, TripId, Segment
-- ORDER BY DeviceId, TripId, Segment

Another version, which uses string processing instead of analytic functions.
Somehow I feel it could be cheaper than the version above.

#standardSQL
SELECT DeviceId, TripId, Speed
FROM (
  SELECT DeviceId, TripId, 
    STRING_AGG(
      CONCAT(IF(Speed = 0, '|', ','), CAST(Speed AS STRING)), 
      '' ORDER BY DateTime) AS Speed
  FROM TripSpeed 
  GROUP BY DeviceId, TripId
), UNNEST(SPLIT(Speed, '|'))  AS Speed
WHERE Speed <> ''
-- ORDER BY DeviceId, TripId  
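
To see why the join-then-split trick works, here is a minimal Python sketch of the same idea. It is an illustration only, not BigQuery code: `rows` and `split_segments` are hypothetical names, and the rows are assumed to be `(device_id, trip_id, speed, date_time)` tuples.

```python
from itertools import groupby

# Hypothetical sample rows: (device_id, trip_id, speed, date_time).
rows = [
    (5, 1, 0, 1),
    (5, 1, 8, 2),
    (5, 1, 12, 3),
    (5, 1, 0, 4),
    (5, 1, 2, 5),
    (5, 2, 0, 6),
    (5, 2, 1, 7),
    (6, 3, 0, 8),
]


def split_segments(rows):
    """Return (device_id, trip_id, 'speed,speed,...') tuples, one per segment."""
    result = []
    # Sort by device, trip, then time so each trip is contiguous and ordered.
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[3]))
    for (device_id, trip_id), trip in groupby(ordered, key=lambda r: (r[0], r[1])):
        # Prefix zero speeds with '|' and every other speed with ',',
        # then concatenate with an empty delimiter, e.g. '|0,8,12|0,2'.
        joined = "".join(
            ("|" if speed == 0 else ",") + str(speed) for _, _, speed, _ in trip
        )
        # Splitting on '|' yields one piece per segment; the piece before
        # the first '|' is empty and is dropped.
        for part in joined.split("|"):
            if part != "":
                result.append((device_id, trip_id, part))
    return result


pieces = split_segments(rows)
for piece in pieces:
    print(piece)
```

Note that the trick relies on every trip starting with a zero-speed row. If a trip starts with a non-zero speed, the first piece keeps its leading comma (for example `,8`), so such data would need extra trimming.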

You can test it with the same dummy sample data:

WITH TripSpeed AS (
  SELECT 5 AS DeviceId, 1 AS TripId, 0 AS Speed, 1 AS DateTime UNION ALL                 
  SELECT 5, 1, 8, 2 UNION ALL                 
  SELECT 5, 1, 12, 3 UNION ALL                
  SELECT 5, 1, 0, 4 UNION ALL                               
  SELECT 5, 1, 2, 5 UNION ALL                
  SELECT 5, 2, 0, 6 UNION ALL
  SELECT 5, 2, 1, 7 UNION ALL
  SELECT 6, 3, 0, 8 
)