查找行之间的时差

时间:2019-09-23 10:49:09

标签: sql google-bigquery dremel

我需要按用户计算IssueID的声明时间  总索赔时间是从状态索赔到最近等待的时间。看起来有点复杂  请帮忙。

 IssueID    TransTime   User    Status
101 2019-08-23 0:25:41  Peter   CLAIMED
101 2019-08-23 0:25:44  Peter   CLAIMED
101 2019-08-23 0:26:12  Peter   WAITING
101 2019-08-23 20:14:13 Peter   CLAIMED
101 2019-08-23 20:14:16 Peter   CLAIMED
101 2019-08-23 20:14:52 Peter   WAITING
102 2019-08-24 8:59:19  Miller  CLAIMED
102 2019-08-24 8:59:56  Miller  CLAIMED
102 2019-08-24 9:00:09  Miller  WAITING
102 2019-08-24 9:00:17  Miller  CLAIMED
102 2019-08-24 9:00:20  Miller  CLAIMED
102 2019-08-25 21:56:52 Miller  WAITING`

例如,彼得的总索赔时间从'2019-08-23 0:25:41'开始,直到第一个等待时间'2019-08-23 0:26:12',然后从'2019-08- 23 20:14:13'至'2019-08-23 20:14:52'。所有这些时间差加起来是peter要求的总时间,第一时间约为31秒,第二时间约为39秒。大约70秒。

预先感谢

`

2 个答案:

答案 0 :(得分:0)

您可以通过计算每行之后“等待” 的数量来标识每个组。然后使用此信息获取每个索赔期。所以:

select issueId,
       min(transTime) as min_time,
       max(transTime) as max_time),
       datetime_diff(min(transTime), max(transTime), second) as time_in_seconds
from (select t.*,
             countif(status = 'WAITING') over (partition by issueId order by transTime desc) as grp
      from t
      where status in ('WAITING', 'CLAIM')
     ) t
group by issueId, grp;

我不确定是否正是您想要的-您可能需要附加级别的聚合。但这是计算每个周期的想法。

答案 1 :(得分:0)

以下是用于BigQuery标准SQL

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 101 IssueID, TIMESTAMP '2019-08-23 0:25:41' TransTime, 'Peter' User, 'CLAIMED' Status UNION ALL
  SELECT 101, '2019-08-23 0:25:44', 'Peter', 'CLAIMED' UNION ALL
  SELECT 101, '2019-08-23 0:26:12', 'Peter', 'WAITING' UNION ALL
  SELECT 101, '2019-08-23 20:14:13', 'Peter', 'CLAIMED' UNION ALL
  SELECT 101, '2019-08-23 20:14:16', 'Peter', 'CLAIMED' UNION ALL
  SELECT 101, '2019-08-23 20:14:52', 'Peter', 'WAITING' UNION ALL
  SELECT 102, '2019-08-24 8:59:19', 'Miller', 'CLAIMED' UNION ALL
  SELECT 102, '2019-08-24 8:59:56', 'Miller', 'CLAIMED' UNION ALL
  SELECT 102, '2019-08-24 9:00:09', 'Miller', 'WAITING' UNION ALL
  SELECT 102, '2019-08-24 9:00:17', 'Miller', 'CLAIMED' UNION ALL
  SELECT 102, '2019-08-24 9:00:20', 'Miller', 'CLAIMED' UNION ALL
  SELECT 102, '2019-08-25 21:56:52', 'Miller', 'WAITING' 
)
SELECT IssueID, SUM(waiting_time) total_waiting_time 
FROM (
  SELECT IssueID, TIMESTAMP_DIFF(MAX(TransTime), MIN(TransTime), SECOND) waiting_time
  FROM (
    SELECT *, COUNTIF(start) OVER(PARTITION BY IssueID ORDER BY TransTime) waiting
    FROM (
      SELECT *, ('CLAIMED' = status AND IFNULL(LAG(status) OVER(PARTITION BY IssueID ORDER BY TransTime), 'WAITING') = 'WAITING') start
      FROM `project.dataset.table`
      WHERE status IN ('CLAIMED', 'WAITING')
    )
  )
  GROUP BY IssueID, waiting
)
GROUP BY IssueID
ORDER BY IssueID

有结果

Row IssueID total_waiting_time   
1   101     70   
2   102     133045