PostgreSQL: calculating consecutive sessions

Time: 2016-03-14 19:36:42

Tags: postgresql calculated-columns

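I have a very large table with 4 columns: 1) the status the member changed from (online, offline, game_lobby, load_screen), 2) the status the member changed to (online, offline, game_lobby, load_screen), 3) the member's ID number, and 4) the timestamp at which the status changed. I would like to calculate the average time all members spend online, that is, the difference between the timestamp at which a status changes from online to offline and the timestamp at which it changed from offline to online: sample dataset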

Using the sample linked above, the calculated average would be ((01/03/2016 15:32:05 - 01/02/2016 07:18:32) + (03/14/2016 05:46:41 - 03/14/2016 04:09:04)) / 2
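As a quick sanity check of that arithmetic, here is a minimal Python sketch. It assumes the timestamps are written as MM/DD/YYYY (the 03/14 values only parse that way) and that the two login/logout pairs are the only sessions in the sample; the variable names are illustrative.

```python
from datetime import datetime

# Assumed format: month/day/year, 24-hour clock.
FMT = "%m/%d/%Y %H:%M:%S"

# (login, logout) pairs taken from the worked example in the question.
sessions = [
    ("01/02/2016 07:18:32", "01/03/2016 15:32:05"),
    ("03/14/2016 04:09:04", "03/14/2016 05:46:41"),
]

# Length of each session in seconds.
durations = [
    (datetime.strptime(logout, FMT) - datetime.strptime(login, FMT)).total_seconds()
    for login, logout in sessions
]

average_seconds = sum(durations) / len(durations)
print(durations, average_seconds, average_seconds / 3600)
```

Under those assumptions the two sessions last 116013 and 5857 seconds, so the expected average is 60935 seconds, or roughly 16.9 hours. A correct query should never produce a negative value for data like this.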

This is what I wrote. It returns negative averages for some members, which cannot be right:

with sessions as
( select 
    date_trunc('week', sc.occurred_at) as week,
    sc.occurred_at,
    sc.id,
    timestampdiff(second,lag(sc.occurred_at) over (order by sc.id asc, sc.occurred_at),
    sc.occurred_at)/3600 as session
  from state_changes sc
  where
    ((from_state = 'offline' and to_state = 'online') or
     (from_state = 'offline' and to_state = 'online'))
    and occurred_at at time zone 'America/New_york' > '2016-01-01'
)
select week, avg(session), id
from sessions
group by 1,3;

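I can aggregate the averages into a single value instead of per member, but what I wrote is clearly wrong, because a few of the averages come back negative. Does anyone have any suggestions?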

1 answer:

Answer 0 (score: 0)

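You are essentially interested in the period that starts with an offline->online change and ends with the next ?->offline change. So the trick is to fetch only those records in a sub-query and then apply lag() over those two kinds of rows. Your code has issues on both counts; see the code below. Then, in the main query, take the average and throw out the offline->online rows.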

SELECT date_trunc('week', logout) AS week,
       avg(extract(epoch FROM logout - login)),  -- in seconds
       id
FROM (
  -- Keep only rows that enter or leave the offline state, then pair each row
  -- with the previous such row of the same member (PARTITION BY id).
  SELECT lag(occurred_at) OVER (PARTITION BY id ORDER BY occurred_at) AS login,
         occurred_at AS logout,
         id,
         to_state
  FROM state_changes
  WHERE (from_state = 'offline' OR to_state = 'offline')
    AND occurred_at > '2016-01-01') sub
WHERE to_state = 'offline'  -- discard the offline->online rows
GROUP BY 1, 3;
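To trace what the query above does, here is a rough Python sketch of the same pairing logic run on a handful of made-up rows. The sample rows and the function name `session_seconds` are hypothetical and not part of the original data; the sketch only mirrors the filter, the per-member lag, and the final `to_state = 'offline'` condition.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (id, from_state, to_state, occurred_at).
rows = [
    (1, "offline", "online", datetime(2016, 1, 2, 7, 18, 32)),
    (1, "online", "game_lobby", datetime(2016, 1, 2, 9, 0, 0)),
    (1, "game_lobby", "offline", datetime(2016, 1, 3, 15, 32, 5)),
    (2, "offline", "online", datetime(2016, 1, 1, 4, 9, 4)),
    (2, "online", "offline", datetime(2016, 1, 1, 5, 46, 41)),
]

def session_seconds(rows):
    """Return {id: [session lengths in seconds]} using a per-member lag."""
    # WHERE from_state = 'offline' OR to_state = 'offline', grouped like PARTITION BY id.
    by_member = defaultdict(list)
    for member_id, from_state, to_state, occurred_at in rows:
        if from_state == "offline" or to_state == "offline":
            by_member[member_id].append((occurred_at, to_state))

    result = defaultdict(list)
    for member_id, events in by_member.items():
        events.sort()    # ORDER BY occurred_at
        previous = None  # plays the role of lag(occurred_at)
        for occurred_at, to_state in events:
            # Outer WHERE to_state = 'offline'; a missing lag is skipped, as avg() skips NULL.
            if to_state == "offline" and previous is not None:
                result[member_id].append((occurred_at - previous).total_seconds())
            previous = occurred_at
    return dict(result)

per_member = session_seconds(rows)

# Without partitioning, member 2's first row would be compared with member 1's last row.
cross_member_gap = (rows[3][3] - rows[2][3]).total_seconds()
print(per_member, cross_member_gap)
```

In this sample, member 2's events happen earlier than member 1's, so a lag ordered by id and timestamp without PARTITION BY id would subtract member 1's last timestamp from member 2's first one and yield a negative gap. That is probably one source of the negative averages in the question, which is why the window is partitioned per member here.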