Count events per user for every event, including counts of 0

Time: 2017-11-01 14:17:59

Tags: sql presto

I have a table of user events, and another table that holds the names of all events in the system.

What I need is a count of every event for every user, including the events for which a user has a count of 0.

The dialect is ANSI SQL, though I am not sure which version. The database is Presto 0.186.

Here is an example:

with
event_names (name) as (values
  ('event_1'), ('event_2'), ('event_3'), ('event_4')
)
, events (user_id, event_name, occurred_at) as (values
    ('id1', 'event_1', timestamp '2017-10-10 00:01:00')
  , ('id1', 'event_2', timestamp '2017-10-10 00:02:00')
  , ('id1', 'event_2', timestamp '2017-10-10 00:03:00')
  , ('id2', 'event_2', timestamp '2017-10-11 00:01:00')
  , ('id2', 'event_3', timestamp '2017-10-11 00:02:00')
  , ('id2', 'event_3', timestamp '2017-10-11 00:03:00')
  , ('id2', 'event_4', timestamp '2017-10-11 00:03:00')
  , ('id3', 'event_1', timestamp '2017-10-12 00:03:00')
  , ('id3', 'event_4', timestamp '2017-10-12 00:04:00')
)

select user_id, event_name, count(*) as event_count, sum(count(*)) over (partition by user_id) as total_events
from events
group by 1, 2
order by 1, 2;

Naturally, this query returns counts only for the events a user actually sent:

 user_id | event_name | event_count 
---------+------------+-------------
 id1     | event_1    |           1 
 id1     | event_2    |           2 

 id2     | event_2    |           1 
 id2     | event_3    |           2 
 id2     | event_4    |           1 

 id3     | event_1    |           1 
 id3     | event_4    |           1 

What I need is the following:

 user_id     |  name   | event_count 
-------------+---------+-------------
 id1         | event_1 |           1
 id1         | event_2 |           2
 id1         | event_3 |           0
 id1         | event_4 |           0

 id2         | event_1 |           0
 id2         | event_2 |           1
 id2         | event_3 |           2
 id2         | event_4 |           0

 id3         | event_1 |           1
 id3         | event_2 |           0
 id3         | event_3 |           0
 id3         | event_4 |           1

1 Answer:

Answer 0: (score: 2)

Use a cross join to generate every user/event combination, then left join the data that actually exists:

select u.user_id, en.event_name, count(e.user_id) as event_count,
       sum(count(e.user_id)) over (partition by u.user_id) as total_events
from (select distinct user_id from events) u cross join
     (select distinct event_name from events) en left join
     events e
     on e.user_id = u.user_id and e.event_name = en.event_name
group by 1, 2
order by 1, 2;

If you have other tables that contain the list of users or the list of events, you can use those tables instead of the subqueries.
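
As a sketch of that last point, the question already has an event_names table, so it can replace the distinct-events subquery. Note that its column is called name, not event_name. This version reuses the sample data from the question and keeps the distinct-users subquery, since the question does not mention a users table. It has not been verified against Presto 0.186 specifically.

```sql
with
event_names (name) as (values
  ('event_1'), ('event_2'), ('event_3'), ('event_4')
)
, events (user_id, event_name, occurred_at) as (values
    ('id1', 'event_1', timestamp '2017-10-10 00:01:00')
  , ('id1', 'event_2', timestamp '2017-10-10 00:02:00')
  , ('id1', 'event_2', timestamp '2017-10-10 00:03:00')
  , ('id2', 'event_2', timestamp '2017-10-11 00:01:00')
  , ('id2', 'event_3', timestamp '2017-10-11 00:02:00')
  , ('id2', 'event_3', timestamp '2017-10-11 00:03:00')
  , ('id2', 'event_4', timestamp '2017-10-11 00:03:00')
  , ('id3', 'event_1', timestamp '2017-10-12 00:03:00')
  , ('id3', 'event_4', timestamp '2017-10-12 00:04:00')
)

-- One row per (user, event name) pair, whether or not the user sent that event
select u.user_id, en.name,
       -- count(e.user_id) skips the NULLs produced by the left join,
       -- so pairs with no matching event get 0; count(*) would give 1
       count(e.user_id) as event_count
from (select distinct user_id from events) u
cross join event_names en
left join events e
  on e.user_id = u.user_id and e.event_name = en.name
group by u.user_id, en.name
order by u.user_id, en.name;
```

Taking the names from event_names also means that an event no user has ever sent still appears with a count of 0 for every user, which the distinct-from-events subquery cannot produce.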