Question

Is it possible to do the following:

select
  avg(count(distinct user_id))
    over (partition by some_date) as average_users_per_day
from user_activity
group by user_type

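(In particular, the partition by column some_date is not among the group by columns.)

What I'm after is the average number of users per day, broken down by user type.

I know how to do this with a subquery (see below), but I'd like to know whether there is a clean way to do it using only over (partition by ...) and group by.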

Notes:

From reading this answer, my understanding (please correct me if I'm wrong) is that the following query:

select
  avg(count(distinct a)) over (partition by b)
from foo
group by b

can be equivalently expanded to:

select
  avg(count_distinct_a)
from (
  select
    b,
    count(distinct a) as count_distinct_a
  from foo
  group by b
) t  -- alias added: Hive expects a name for a subquery in the from clause
group by b

I can then tweak it slightly to get what I want:

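select
  user_type,  -- included so each average is labelled with its user type
  avg(count_distinct_user_id) as average_users_per_day
from (
  select
    user_type,
    count(distinct user_id) as count_distinct_user_id
  from user_activity
  group by user_type, some_date
) t  -- alias added: Hive expects a name for a subquery in the from clause
group by user_type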

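(Note that the inner group by user_type, some_date differs from the outer group by user_type.)

What I'd like is a way to tell the partition by / group by combination to use a finer-grained "sub-group-by" (here user_type, some_date) for the windowed part. Please let me know if my understanding of partition by / group by is completely off.

Edit: some sample data and the desired output.

Source table: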

+---------+-----------+-----------+
| user_id | user_type | some_date |
+---------+-----------+-----------+
| 1       | a         | 1         |
| 1       | a         | 2         |
| 2       | a         | 1         |
| 3       | a         | 2         |
| 3       | a         | 2         |
| 4       | b         | 2         |
| 5       | b         | 1         |
| 5       | b         | 3         |
| 5       | b         | 3         |
| 6       | c         | 1         |
| 7       | c         | 1         |
| 8       | c         | 4         |
| 9       | c         | 2         |
| 9       | c         | 3         |
| 9       | c         | 4         |
+---------+-----------+-----------+

Example intermediate table (for reasoning):

+-----------+-----------+---------------------+
| user_type | some_date | distinct_user_count |
+-----------+-----------+---------------------+
| a         | 1         | 2                   |
| a         | 2         | 2                   |
| b         | 1         | 1                   |
| b         | 2         | 1                   |
| b         | 3         | 1                   |
| c         | 1         | 2                   |
| c         | 2         | 1                   |
| c         | 3         | 1                   |
| c         | 4         | 2                   |
+-----------+-----------+---------------------+

The SQL for it is: select user_type, some_date, count(distinct user_id) as distinct_user_count from user_activity group by user_type, some_date.

Desired result:

+-----------+---------------------+
| user_type | average_daily_users |
+-----------+---------------------+
| a         | 2                   |
| b         | 1                   |
| c         | 1.5                 |
+-----------+---------------------+
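
To make the sample data easy to check, here is a minimal sketch that loads the rows above and runs the subquery form of the query. It assumes Python's built-in sqlite3 module as a stand-in for Hive, so it only confirms the arithmetic of the intermediate and desired tables, not Hive's handling of over (partition by ...). The names ROWS, INTERMEDIATE_SQL, AVERAGE_SQL and run_queries, and the subquery alias t, are illustrative and not part of the original question.

```python
import sqlite3

# Sample rows from the question: (user_id, user_type, some_date).
ROWS = [
    (1, "a", 1), (1, "a", 2), (2, "a", 1), (3, "a", 2), (3, "a", 2),
    (4, "b", 2), (5, "b", 1), (5, "b", 3), (5, "b", 3),
    (6, "c", 1), (7, "c", 1), (8, "c", 4), (9, "c", 2), (9, "c", 3), (9, "c", 4),
]

# Inner step: distinct users per (user_type, some_date).
INTERMEDIATE_SQL = """
select user_type, some_date, count(distinct user_id) as distinct_user_count
from user_activity
group by user_type, some_date
order by user_type, some_date
"""

# Outer step: average the per-day counts within each user_type.
AVERAGE_SQL = """
select user_type, avg(distinct_user_count) as average_daily_users
from (
  select user_type, some_date, count(distinct user_id) as distinct_user_count
  from user_activity
  group by user_type, some_date
) t
group by user_type
order by user_type
"""


def run_queries():
    """Load the sample rows into an in-memory SQLite table and run both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table user_activity "
        "(user_id integer, user_type text, some_date integer)"
    )
    conn.executemany("insert into user_activity values (?, ?, ?)", ROWS)
    intermediate = conn.execute(INTERMEDIATE_SQL).fetchall()
    averages = conn.execute(AVERAGE_SQL).fetchall()
    conn.close()
    return intermediate, averages


if __name__ == "__main__":
    intermediate, averages = run_queries()
    for row in intermediate:
        print(row)
    for row in averages:
        print(row)
```

Run against the sample rows, the first query should reproduce the intermediate table and the second the desired result (2, 1 and 1.5 for types a, b and c, returned by SQLite as floats).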

Hive: over (partition by ...) with a column not in the group by
