Question

Consider a table in a PostgreSQL database with the following structure (stripped down here for demonstration purposes):

time:  Timestamp
type:  Text
value: Integer

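Over time this table fills up with a large number of events. We now need an SQL statement for reporting that aggregates the values by time (for example, per hour) and computes the average and count for a specific type. The report should look like this: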

[example report for type="standard"]  
Time    Count   Avg   
00:00   30      20
01:00   12      24
02:00   9       19
...

Up to this point it is quite simple; the statement for the report above is:

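select extract(hour from time) as hour, count(*), avg(value)
from reportdata
where type = 'standard'
-- group by the alias "hour"; "group by time" would resolve to the raw time column
group by hour
order by hour;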

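Now for the tricky part: I need to show a report for every type that also contains the total count across all types and the percentage that the specific type contributes within each time frame. For this I need a statement that produces one row for every time frame and every possible type (the types can be selected from a separate table containing all possible types), so that the code can extract the report tab for each type without querying the database again. The result should look like this (note the "empty" rows for time frames in which no values were found for a type):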

[example report for all types assuming there are 3 possible types]  
Time    Type       Total   Count   Percent   Avg   
00:00   standard   40      30      0.75      20
00:00   special    40      10      0.25      8
00:00   super      40      0       0         0
01:00   standard   12      12      1.0       24
01:00   special    12      0       0         0
01:00   super      12      0       0         0
02:00   standard   9       3       0.33      20
02:00   special    9       0       0         0
02:00   super      9       6       0.67      15
...

What would such a statement look like?

Answer 1

select 
    s.hour as "Time", 
    s.type as "Type", 
    s.total as "Total", 
    coalesce(r.total, 0) as "Count", 
    round(coalesce(r.total, 0) * 1.0/s.total, 2) as "Percent", 
    round(coalesce(r.avg, 0), 2) as "Avg"
from (
    select 
        date_trunc('hour', time) as hour, 
        type, 
        count(*) as total, 
        avg(value) as avg
    from reportdata
    group by hour, type
    ) r
right outer join (
    select 
        date_trunc('hour', time) as hour,
        t.type,
        count(*) as total
    from reportdata
    inner join type t on true
    group by hour, t.type
    ) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
        Time         |   Type   | Total | Count | Percent |  Avg  
---------------------+----------+-------+-------+---------+-------
 2012-04-02 00:00:00 | special  |    40 |    10 |    0.25 |  8.00
 2012-04-02 00:00:00 | standard |    40 |    30 |    0.75 | 20.00
 2012-04-02 00:00:00 | super    |    40 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | special  |    12 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | standard |    12 |    12 |    1.00 | 24.00
 2012-04-02 01:00:00 | super    |    12 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | special  |     9 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | standard |     9 |     3 |    0.33 | 20.00
 2012-04-02 02:00:00 | super    |     9 |     6 |    0.67 | 15.00
(9 rows)

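I used date_trunc on the timestamp because I assume you want to keep each hour of each day separate. If you want each hour aggregated across all dates instead, just revert to extract.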
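
To make the difference between the two functions concrete, here is a small sketch. The two inline sample rows and the column name ts are made up for illustration and are not taken from the question's data.

```sql
-- Two hypothetical events at the same hour of day on different dates.
select
    ts,
    date_trunc('hour', ts)  as hour_of_that_day,  -- keeps the date: one bucket per day and hour
    extract(hour from ts)   as hour_of_any_day    -- drops the date: one bucket per hour of day
from (values
    (timestamp '2012-04-01 22:15:00'),
    (timestamp '2012-04-02 22:45:00')
) as sample(ts);
```

With date_trunc the two rows land in different buckets (22:00 on April 1 and 22:00 on April 2), while extract returns 22 for both, so grouping on it would merge them.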

Updated to match the new requirement from the comments:

select 
    s.hour as "Time", 
    s.type as "Type", 
    s.total as "Total", 
    coalesce(r.total, 0) as "Count", 
    case s.total when 0 then round(0, 2) else
        round(coalesce(r.total, 0) * 1.0/s.total, 2)
        end as "Percent", 
    round(coalesce(r.avg, 0), 2) as "Avg"
from (
    select 
        date_trunc('hour', time) as hour, 
        type, 
        count(*) as total, 
        avg(value) as avg
    from reportdata
    group by hour, type
) r
right outer join (
    select 
        date_trunc('hour', d) as hour,
        t.type,
        count(r.time) as total
    from reportdata r
    right outer join (
        select d 
        from generate_series(
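            -- truncate the start so the series lands on hour boundaries
            -- and the final partial hour is not skipped
            (select date_trunc('hour', min(time)) from reportdata),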
            (select max(time) from reportdata),
            '1 hour'
        ) g(d)
    ) g on date_trunc('hour', g.d) = date_trunc('hour', r.time)
    inner join type t on true
    group by hour, t.type
) s on s.hour = r.hour and s.type = r.type
order by s.hour, s.type
;
        Time         |   Type   | Total | Count | Percent |  Avg  
---------------------+----------+-------+-------+---------+-------
 2012-04-01 22:00:00 | special  |     1 |     0 |    0.00 |  0.00
 2012-04-01 22:00:00 | standard |     1 |     1 |    1.00 | 10.00
 2012-04-01 22:00:00 | super    |     1 |     0 |    0.00 |  0.00
 2012-04-01 23:00:00 | special  |     0 |     0 |    0.00 |  0.00
 2012-04-01 23:00:00 | standard |     0 |     0 |    0.00 |  0.00
 2012-04-01 23:00:00 | super    |     0 |     0 |    0.00 |  0.00
 2012-04-02 00:00:00 | special  |    40 |    10 |    0.25 |  8.00
 2012-04-02 00:00:00 | standard |    40 |    30 |    0.75 | 20.00
 2012-04-02 00:00:00 | super    |    40 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | special  |    12 |     0 |    0.00 |  0.00
 2012-04-02 01:00:00 | standard |    12 |    12 |    1.00 | 24.00
 2012-04-02 01:00:00 | super    |    12 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | special  |     9 |     0 |    0.00 |  0.00
 2012-04-02 02:00:00 | standard |     9 |     3 |    0.33 | 20.00
 2012-04-02 02:00:00 | super    |     9 |     6 |    0.67 | 15.00
 2012-04-02 03:00:00 | special  |     0 |     0 |    0.00 |  0.00
 2012-04-02 03:00:00 | standard |     0 |     0 |    0.00 |  0.00
 2012-04-02 03:00:00 | super    |     0 |     0 |    0.00 |  0.00
 2012-04-02 04:00:00 | special  |     1 |     0 |    0.00 |  0.00
 2012-04-02 04:00:00 | standard |     1 |     1 |    1.00 | 10.00
 2012-04-02 04:00:00 | super    |     1 |     0 |    0.00 |  0.00
(21 rows)
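
The empty rows come from the hour-by-type skeleton that generate_series and the join on true build before any data is attached. A minimal sketch of just that skeleton, assuming hard-coded bounds and an inline list of types as stand-ins for the reportdata bounds and the type table:

```sql
-- Every hour in the range paired with every type, regardless of whether data exists.
select g.d as hour, t.type
from generate_series(
    timestamp '2012-04-01 22:00:00',   -- assumed start, already on an hour boundary
    timestamp '2012-04-02 00:00:00',   -- assumed end
    interval '1 hour'
) as g(d)
cross join (values ('special'), ('standard'), ('super')) as t(type)
order by g.d, t.type;
```

This yields three hours times three types, nine rows in total. Outer-joining the aggregated data onto such a skeleton is what lets coalesce fill in zeros for hours and types that have no events.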

Answer 2

Something like this?

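-- build every (hour, type) pair first, then attach the matching rows
select hours.hour as time,
       at.type,
       sum(count(rd.type)) over (partition by hours.hour) as total,
       count(rd.type) as count,
       coalesce(avg(rd.value), 0) as avg
from (select distinct extract(hour from time) as hour from reportdata) hours
  cross join all_types at
  left join reportdata rd
    on rd.type = at.type and extract(hour from rd.time) = hours.hour
group by hours.hour, at.type
order by hours.hour, at.type;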

(all_types is the "separate table containing all possible types".)

PostgreSQL: help needed with a statement
