Conditional increment in BigQuery

Time: 2016-03-01 06:40:46

Tags: google-bigquery

I have a data table like this:

user_id  event_time
1        1456812346
1        1456812350
1        1456812446
1        1456812950
1        1456812960

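Now I'm trying to define a 'session_id' for each user based on event_time. If an event arrives more than 180 seconds after the previous event, it is treated as belonging to a new session. So I'd like output similar to: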
user_id  event_time  session_id
1        1456812346  1
1        1456812350  1
1        1456812446  1
1        1456812950  2
1        1456812960  2

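The session increments at row 4 because its time is 504 seconds after row 3, which exceeds the 180-second threshold.

In MySQL I can declare a variable and then conditionally increment it. Since BigQuery doesn't support creating variables, is there another way to achieve this?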

1 answer:

Answer 0: (score: 1)

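-- Written for BigQuery legacy SQL: new_session is reused by alias within the same SELECT.
SELECT 
  user_id, event_time, session_id
FROM (
  SELECT 
    user_id, event_time,
    -- true when the gap since the user's previous event exceeds 180 seconds
    event_time - last_time > 180 AS new_session, 
    -- running count of session starts; the first event (NULL gap) counts as 1
    SUM(IFNULL(new_session, 1)) 
        OVER(PARTITION BY user_id ORDER BY event_time) AS session_id
  FROM (
    SELECT user_id, event_time,
      -- previous event time for the same user, NULL for the first event
      LAG(event_time) OVER(PARTITION BY user_id ORDER BY event_time) AS last_time
    FROM YourTable
  )
)
ORDER BY event_time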
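
The query above reuses the alias new_session inside the same SELECT and sums a boolean, which legacy SQL accepts but, as far as I know, BigQuery standard SQL does not. For anyone on standard SQL, here is a minimal sketch of the same idea: look up the previous event time with LAG, flag each row that starts a session, and take a running sum of those flags per user. The inline YourTable rows are just the sample data from the question, and the names with_previous and is_new_session are made up for illustration.

```sql
-- Sketch for BigQuery standard SQL; replace the inline YourTable with a real table.
WITH YourTable AS (
  SELECT 1 AS user_id, 1456812346 AS event_time UNION ALL
  SELECT 1, 1456812350 UNION ALL
  SELECT 1, 1456812446 UNION ALL
  SELECT 1, 1456812950 UNION ALL
  SELECT 1, 1456812960
),
with_previous AS (
  SELECT
    user_id,
    event_time,
    -- Previous event time for the same user; NULL for the user's first event.
    LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time) AS last_time
  FROM YourTable
),
flagged AS (
  SELECT
    user_id,
    event_time,
    -- 1 when the row starts a session: first event, or gap above 180 seconds.
    IF(last_time IS NULL OR event_time - last_time > 180, 1, 0) AS is_new_session
  FROM with_previous
)
SELECT
  user_id,
  event_time,
  -- Running count of session starts per user gives the session number.
  SUM(is_new_session) OVER (PARTITION BY user_id ORDER BY event_time) AS session_id
FROM flagged
ORDER BY user_id, event_time;
```

On the sample rows the gaps are 4, 96, 504 and 10 seconds, so only the first and fourth rows are flagged and session_id comes out as 1, 1, 1, 2, 2. The explicit IS NULL check matters: without it the first event's flag would be NULL or 0 and numbering would start at 0 instead of 1.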