I have an items table that contains a status and a creation date:
+----+-----------+------------+
| id | status    | created    |
+----+-----------+------------+
|  1 | PROCESSED | 2018-12-01 |
|  2 | PROCESSED | 2018-12-01 |
|  3 | ABORTED   | 2018-12-01 |
+----+-----------+------------+
There is a corresponding item status table, which gets a new row whenever an item's status changes:
+----+---------+-----------+------------------+
| id | item_id | status    | created          |
+----+---------+-----------+------------------+
|  1 |       1 | RECEIVED  | 2018-12-01 10:00 |
|  2 |       1 | PROCESSED | 2018-12-01 12:00 |
|  3 |       2 | RECEIVED  | 2018-12-01 11:00 |
|  4 |       2 | PROCESSED | 2018-12-01 12:00 |
|  5 |       3 | RECEIVED  | 2018-12-01 13:00 |
|  6 |       3 | ABORTED   | 2018-12-01 13:30 |
+----+---------+-----------+------------------+
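I want to generate a report, grouped by day, that shows the average time taken to process an item, excluding aborted items. (The time to process an item is the difference between its RECEIVED and PROCESSED timestamps.)
Something like this (duration in seconds):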
+------------+------------------+
| day        | avg_duration     |
+------------+------------------+
| 2018-12-01 |             5400 |
+------------+------------------+
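From other questions I gather that this can be solved with partitioning, but I have not managed to write a working query. What is the best way to do this?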
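For anyone who wants to reproduce this, the sample data could be loaded with something like the script below. This is only a sketch: the question shows the data but not the schema, so the column types, the constraints, and the table name item_status (the name used in the first answer) are assumptions.

```sql
-- Assumed schema: the question gives no column types or constraints.
create table items (
    id      integer primary key,
    status  text not null,
    created date not null
);

-- Assumed table name; one row per status change of an item.
create table item_status (
    id      integer primary key,
    item_id integer not null references items (id),
    status  text not null,
    created timestamp not null
);

insert into items (id, status, created) values
    (1, 'PROCESSED', '2018-12-01'),
    (2, 'PROCESSED', '2018-12-01'),
    (3, 'ABORTED',   '2018-12-01');

insert into item_status (id, item_id, status, created) values
    (1, 1, 'RECEIVED',  '2018-12-01 10:00'),
    (2, 1, 'PROCESSED', '2018-12-01 12:00'),
    (3, 2, 'RECEIVED',  '2018-12-01 11:00'),
    (4, 2, 'PROCESSED', '2018-12-01 12:00'),
    (5, 3, 'RECEIVED',  '2018-12-01 13:00'),
    (6, 3, 'ABORTED',   '2018-12-01 13:30');
```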
Answer 0 (score: 2)
Use the boolean aggregate bool_and() to filter out the aborted items:
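select date, avg(duration)
from (
    -- one row per item and day: seconds between its first and last status change
    select created::date as date,
           item_id,
           extract(epoch from max(created) - min(created)) as duration
    from item_status
    group by created::date, item_id
    -- keep only the groups in which no status row is ABORTED
    having bool_and(status <> 'ABORTED')
) s
group by date;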
    date    | avg
------------+------
 2018-12-01 | 5400
(1 row)
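To see where the 5400 comes from, it can help to look at the per-item rows before the HAVING filter is applied. The sketch below inlines the question's sample data as a CTE named item_status, so it runs without any tables, and exposes the bool_and() result as a column. The column name no_abort is made up for illustration, and created is assumed to be a timestamp.

```sql
with item_status (id, item_id, status, created) as (
    values
        (1, 1, 'RECEIVED',  timestamp '2018-12-01 10:00'),
        (2, 1, 'PROCESSED', timestamp '2018-12-01 12:00'),
        (3, 2, 'RECEIVED',  timestamp '2018-12-01 11:00'),
        (4, 2, 'PROCESSED', timestamp '2018-12-01 12:00'),
        (5, 3, 'RECEIVED',  timestamp '2018-12-01 13:00'),
        (6, 3, 'ABORTED',   timestamp '2018-12-01 13:30')
)
select item_id,
       created::date as date,
       -- true only if every status row of the item is something other than ABORTED
       bool_and(status <> 'ABORTED') as no_abort,
       -- seconds between the item's first and last status change
       extract(epoch from max(created) - min(created)) as duration
from item_status
group by created::date, item_id
order by item_id;
-- item 1: no_abort is true,  duration is 7200 seconds
-- item 2: no_abort is true,  duration is 3600 seconds
-- item 3: no_abort is false, duration is 1800 seconds
```

Only items 1 and 2 pass the filter, and (7200 + 3600) / 2 = 5400. Note that max(created) - min(created) equals the RECEIVED-to-PROCESSED time only as long as those are the first and last status rows of an item, which holds for the sample data.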
Answer 1 (score: 1)
This needs two levels of aggregation: first by item and date, then by date.
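-- avg_diff comes back as an interval if created is a timestamp, not as a number of seconds
select dt_created, avg(diff) as avg_diff
from (
    select item_id,
           created::date as dt_created,
           -- PROCESSED time minus RECEIVED time for this item and day
           max(case when status = 'PROCESSED' then created end)
             - max(case when status = 'RECEIVED' then created end) as diff
    from item_statuses
    group by item_id, created::date
    -- drop any item that has an ABORTED row
    having count(case when status = 'ABORTED' then 1 end) = 0
) t
group by dt_created;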
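Since the question asks for the duration in seconds, the averaged interval can be converted with extract(epoch from ...). A possible variant is sketched below. It assumes PostgreSQL 9.4 or later (for the aggregate FILTER clause, used here in place of the CASE expressions) and that created is a timestamp. The sample data is inlined as a CTE named item_status so the query runs on its own; against real tables, drop the CTE and use the actual table name.

```sql
with item_status (id, item_id, status, created) as (
    values
        (1, 1, 'RECEIVED',  timestamp '2018-12-01 10:00'),
        (2, 1, 'PROCESSED', timestamp '2018-12-01 12:00'),
        (3, 2, 'RECEIVED',  timestamp '2018-12-01 11:00'),
        (4, 2, 'PROCESSED', timestamp '2018-12-01 12:00'),
        (5, 3, 'RECEIVED',  timestamp '2018-12-01 13:00'),
        (6, 3, 'ABORTED',   timestamp '2018-12-01 13:30')
)
select dt_created as day,
       -- average the per-item intervals, then convert the result to seconds
       extract(epoch from avg(diff)) as avg_duration
from (
    select item_id,
           created::date as dt_created,
           max(created) filter (where status = 'PROCESSED')
             - max(created) filter (where status = 'RECEIVED') as diff
    from item_status
    group by item_id, created::date
    -- exclude items that have any ABORTED row
    having count(*) filter (where status = 'ABORTED') = 0
) t
group by dt_created;
-- for the sample data: one row, 2018-12-01 with an average of 5400 seconds
```

An item that has been received but not yet processed produces a NULL diff here, and avg() ignores NULLs, so such items do not affect the average.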