Using an aggregate function that resets when a condition is met?

Time: 2015-07-03 10:49:27

Tags: google-bigquery

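I'm working with event data and am currently trying to calculate the time spent in an app by summing the differences between each timestamp and the previous one. The problem is that I need this total to reset every time the value of the "packageName" column changes. I've tried the following.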

SELECT    
    SUM(timeDifference) OVER(PARTITION BY packageName ORDER BY sNumber, timestamp) as accTime,
    *
FROM table.name
ORDER BY
    sNumber, timestamp

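However, the result is too clever for my purposes. I need it to forget its running total each time the package changes, rather than remembering the earlier results for that package and accumulating onto them.

My question is whether there is any way to reset this function. Below is an example of what I get and what I want the output to be. Any help would be much appreciated.

What I get: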

**accTime      diff         packageName**
10              10          com.package.1
20              20          com.package.1
10              10          com.package.2
20              20          com.package.2
30              10          com.package.1

What I want:

**accTime      diff         packageName**
10              10          com.package.1
20              20          com.package.1
10              10          com.package.2
20              20          com.package.2
10              10          com.package.1

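The second example shows the accumulated time for "first" (com.package.1 in the tables above) being reset when it reappears, which is the part I need help with.

To explain further, here is a sample of the raw data: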

**timestamp          packageName          sNumber      eventID      diff**
  1433119125117      com.package.1        xx123xx      event1       null
  1433119125200      com.package.1        xx123xx      event2         83
  1433119125400      com.package.2        xx123xx      event3        200
  1433119125600      com.package.2        xx123xx      event4        200
  1433119125800      com.package.1        xx123xx      event5        200

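2 answers:

Answer 0 (score: 1)

Using the LAG function (you'll notice my answer looks similar to Pentium's), I think this is what you want...

I'm not 100% sure, because your accTime seems to behave oddly relative to your diff... To me, accTime should be the previous accTime + diff, shouldn't it? (Correct me if I'm wrong; with the query where it is now, it's easy to adjust :))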

-- BigQuery legacy SQL: a comma between subqueries in FROM means UNION ALL.
SELECT
  timestamp,package,sNumber,eventID,diff,
  CASE WHEN lagPackage IS NULL then 0
  WHEN package != lagPackage THEN diff 
  ELSE (diff + IF(lagDiff is null, 0,lagDiff)) END AS accTime
FROM (
  SELECT
    *,
    LAG(package,1) OVER (ORDER BY timestamp) AS lagPackage,
    LAG(diff,1,0) OVER (ORDER BY timestamp) AS lagDiff
  FROM (
    SELECT
      1433119125117 AS timestamp,
      'com.package.1' AS package,
      'xxx123xxx' AS sNumber,
      'event1' AS eventID,
      NULL AS diff),
    (
    SELECT
      1433119125200 AS timestamp,
      'com.package.1' AS package,
      'xxx123xxx' AS sNumber,
      'event2' AS eventID,
      83 AS diff),
    (
    SELECT
      1433119125400 AS timestamp,
      'com.package.2' AS package,
      'xxx123xxx' AS sNumber,
      'event3' AS eventID,
      200 AS diff),
    (
    SELECT
      1433119125600 AS timestamp,
      'com.package.2' AS package,
      'xxx123xxx' AS sNumber,
      'event4' AS eventID,
      200 AS diff),
    (
    SELECT
      1433119125800 AS timestamp,
      'com.package.1' AS package,
      'xxx123xxx' AS sNumber,
      'event5' AS eventID,
      200 AS diff),
  ORDER BY
    timestamp )

For the sample set you provided, this returns:

Row  timestamp      package        sNumber    eventID  diff  accTime
1    1433119125117  com.package.1  xxx123xxx  event1   null  0
2    1433119125200  com.package.1  xxx123xxx  event2   83    83
3    1433119125400  com.package.2  xxx123xxx  event3   200   200
4    1433119125600  com.package.2  xxx123xxx  event4   200   400
5    1433119125800  com.package.1  xxx123xxx  event5   200   200
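
One thing worth checking: the ELSE branch above adds only the immediately preceding diff, so it matches a true running total only while each run of the same package has at most two rows. As a way to sanity-check query output, here is a minimal Python sketch of the intended semantics, a running total that restarts whenever the package changes between consecutive rows. It is not part of the original answer; the function name `running_total_per_run` is hypothetical, and treating a NULL diff as 0 is an assumption.

```python
from itertools import accumulate, groupby

# Sample rows from the question: (timestamp, packageName, diff).
# None stands in for the NULL diff of the first event.
rows = [
    (1433119125117, "com.package.1", None),
    (1433119125200, "com.package.1", 83),
    (1433119125400, "com.package.2", 200),
    (1433119125600, "com.package.2", 200),
    (1433119125800, "com.package.1", 200),
]


def running_total_per_run(rows):
    """Return one accTime per row, restarting whenever packageName changes."""
    ordered = sorted(rows, key=lambda r: r[0])  # order by timestamp
    totals = []
    # groupby merges only *consecutive* equal keys, which is the reset rule.
    for _, run in groupby(ordered, key=lambda r: r[1]):
        diffs = [r[2] or 0 for r in run]  # assumption: NULL diff counts as 0
        totals.extend(accumulate(diffs))
    return totals


acc_time = running_total_per_run(rows)
print(acc_time)  # [0, 83, 200, 400, 200]
```

On this sample the result agrees with the accTime column above. On a run of three rows with diffs 10, 11, 12, the sketch gives 10, 21, 33, whereas adding only the previous diff would give 23 for the third row.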

Answer 1 (score: 0)

I was playing with some samples in the meantime. This is not a complete answer, but it may help someone.

select 
  pos,label,diff,
  if (lag!=label or lag is null,1,0) as reset
from(
  select 
    pos,label,diff,
    LAG(label, 1) OVER (ORDER BY pos asc) lag,
  from (select 10 as diff,'first' as label, 1 as pos),
    (select 20 as diff,'first' as label, 2 as pos),
    (select 10 as diff,'second' as label, 3 as pos),
    (select 20 as diff,'second' as label, 4 as pos),
    (select 10 as diff,'first' as label, 5 as pos),
    (select 11 as diff,'first' as label, 6 as pos),
    (select 12 as diff,'first' as label, 7 as pos),
  order by pos
)

This returns:

+-----+-----+--------+------+-------+
| Row | pos | label  | diff | reset |
+-----+-----+--------+------+-------+
|   1 |   1 | first  |   10 |     1 |
|   2 |   2 | first  |   20 |     0 |
|   3 |   3 | second |   10 |     1 |
|   4 |   4 | second |   20 |     0 |
|   5 |   5 | first  |   10 |     1 |
|   6 |   6 | first  |   11 |     0 |
|   7 |   7 | first  |   12 |     0 |
+-----+-----+--------+------+-------+
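
One possible way to finish this idea, offered as a sketch under assumptions rather than something tested on BigQuery: a running sum of the reset flag gives every consecutive run its own id, and a sum partitioned by that id restarts at each change. The example below runs the SQL against an in-memory SQLite database from Python (SQLite 3.25 or later is needed for window functions) using the same seven rows. The names `events`, `run_id`, and `accTime` are illustrative, and the SQL would likely need small syntax adjustments for BigQuery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (pos INTEGER, label TEXT, diff INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "first", 10), (2, "first", 20),
        (3, "second", 10), (4, "second", 20),
        (5, "first", 10), (6, "first", 11), (7, "first", 12),
    ],
)

query = """
WITH flagged AS (
  -- reset = 1 on the first row and whenever the label differs from the previous row
  SELECT pos, label, diff,
         CASE WHEN label = LAG(label) OVER (ORDER BY pos) THEN 0 ELSE 1 END AS reset
  FROM events
),
runs AS (
  -- running count of resets: every consecutive run of one label gets its own id
  SELECT *, SUM(reset) OVER (ORDER BY pos) AS run_id
  FROM flagged
)
-- running total that starts over at each new run
SELECT pos, label, diff, reset,
       SUM(diff) OVER (PARTITION BY run_id ORDER BY pos) AS accTime
FROM runs
ORDER BY pos
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For these rows the accTime column comes out as 10, 30, 10, 30, 10, 21, 33, so the total for "first" starts over at pos 5 instead of continuing from pos 2.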