I have the following two tables in a PostgreSQL database:
dummy=# select * from employee;
 id | name
----+-------
  1 | John
  2 | Susan
  3 | Jim
  4 | Sarah
(4 rows)
dummy=# select * from stats;
 id | arrival  |    day     | employee_id
----+----------+------------+-------------
  2 | 08:31:34 | monday     |           2
  4 | 08:15:00 | monday     |           3
  5 | 08:43:00 | monday     |           4
  1 | 08:34:00 | monday     |           1
  7 | 08:29:00 | midweek    |           1
  8 | 08:31:00 | midweek    |           2
  9 | 08:10:00 | midweek    |           3
 10 | 08:40:00 | midweek    |           4
 11 | 08:28:00 | midweek    |           1
 12 | 08:33:00 | midweek    |           2
 14 | 08:21:00 | midweek    |           3
 15 | 08:45:00 | midweek    |           4
 16 | 08:25:00 | midweek    |           1
 17 | 08:35:00 | midweek    |           2
 18 | 08:44:00 | midweek    |           4
 19 | 08:10:00 | friday     |           1
 20 | 08:40:00 | friday     |           2
 21 | 08:30:00 | friday     |           3
 22 | 08:30:00 | friday     |           4
(19 rows)
I want to select all employees who arrived between 8:25 and 8:35 on midweek and friday. That part is relatively simple, and I can do it with the following query:
SELECT * FROM stats
WHERE
arrival >= (time '8:30' - interval '5 minutes')
AND
arrival <= (time '8:30' + interval '5 minutes')
AND
(day = 'midweek' or day = 'friday');
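However, there is another criterion: I only want to select employees who arrived within the above time window at least 60% of the time. This is where I am stuck. I don't know how to compute that ratio.
What query would satisfy all of these criteria?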
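Clarification
Apparently my description of the ratio above was misleading.
Only rows that match the criterion (day = 'midweek' or day = 'friday') should be considered when computing the ratio. In the sample data, John and Susan each showed up for work four times on midweek and friday. Three of those four arrivals were on time, so the ratio for both Susan and John is 75%.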
Answer 0 (score: 1)
Use common table expressions to compute the required counts, for example:
with in_time as (
    -- midweek/friday arrivals inside the 08:25 to 08:35 window
    select *
    from stats
    where arrival >= (time '8:30' - interval '5 minutes')
    and arrival <= (time '8:30' + interval '5 minutes')
    and (day = 'midweek' or day = 'friday')
),
count_in_time as (
    -- number of in-window arrivals per employee
    select employee_id, count(*)
    from in_time
    group by employee_id
),
total_count as (
    -- number of all midweek/friday arrivals per employee
    select employee_id, count(*)
    from stats
    where day = 'midweek' or day = 'friday'
    group by employee_id
)
select
    i.*,
    c.count as in_time,
    t.count as total_count,
    round(c.count* 100.0/t.count, 2) as ratio
from in_time i
join count_in_time c using(employee_id)
join total_count t using(employee_id);
Result:
 id | arrival  |   day   | employee_id | in_time | total_count | ratio
----+----------+---------+-------------+---------+-------------+-------
 16 | 08:25:00 | midweek |           1 |       3 |           4 | 75.00
 11 | 08:28:00 | midweek |           1 |       3 |           4 | 75.00
  7 | 08:29:00 | midweek |           1 |       3 |           4 | 75.00
 17 | 08:35:00 | midweek |           2 |       3 |           4 | 75.00
 12 | 08:33:00 | midweek |           2 |       3 |           4 | 75.00
  8 | 08:31:00 | midweek |           2 |       3 |           4 | 75.00
 21 | 08:30:00 | friday  |           3 |       1 |           3 | 33.33
 22 | 08:30:00 | friday  |           4 |       1 |           4 | 25.00
(8 rows)
You can add a suitable condition in a WHERE clause on the final query, for instance c.count * 100.0 / t.count >= 60 to keep only the employees at or above the 60% threshold.
If you only want aggregated data, one row per employee with the ratio, use count() with a filter.
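The aggregated variant with count() and a filter could look roughly like the sketch below. It is an illustration rather than the answerer's original query, and it makes a few assumptions: PostgreSQL 9.4 or later (for the FILTER clause), the window written as BETWEEN time '08:25' AND time '08:35' (inclusive on both ends, like the >= / <= pair above), and a HAVING clause added to apply the 60% threshold.

```sql
-- Sketch: one row per employee, counting only midweek/friday arrivals.
-- Assumes PostgreSQL 9.4+ for the FILTER clause.
select
    employee_id,
    -- arrivals inside the 08:25 to 08:35 window
    count(*) filter (
        where arrival between time '08:25' and time '08:35'
    ) as in_time,
    -- all midweek/friday arrivals for this employee
    count(*) as total_count,
    round(
        count(*) filter (
            where arrival between time '08:25' and time '08:35'
        ) * 100.0 / count(*),
        2
    ) as ratio
from stats
where day in ('midweek', 'friday')
group by employee_id
-- keep only employees who were in the window at least 60% of the time
having count(*) filter (
           where arrival between time '08:25' and time '08:35'
       ) >= 0.6 * count(*)
order by employee_id;
```

With the sample data from the question, this should leave employees 1 and 2 (John and Susan), each with 3 in-window arrivals out of 4. Drop the HAVING clause to see the ratio for every employee.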
Answer 1 (score: 0)
You can get the arrival rate like this, for example:
SELECT name,
AVG(CASE WHEN arrival >= (time '8:30' - interval '5 minutes') AND
arrival <= (time '8:30' + interval '5 minutes') THEN 1 ELSE 0 END) AS arrival_rate
FROM employee
INNER JOIN stats ON stats.employee_id = employee.id
GROUP BY name
To select only those whose rate is above 60%, just add a HAVING condition:
SELECT name,
AVG(CASE WHEN arrival >= (time '8:30' - interval '5 minutes') AND
arrival <= (time '8:30' + interval '5 minutes') THEN 1 ELSE 0 END) AS arrival_rate
FROM employee
INNER JOIN stats ON stats.employee_id = employee.id
GROUP BY name
HAVING
AVG(CASE WHEN arrival >= (time '8:30' - interval '5 minutes') AND
arrival <= (time '8:30' + interval '5 minutes') THEN 1 ELSE 0 END)
> 0.6
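Note that the queries above average over every row in stats, including monday, and use a strict > 0.6. If the ratio should follow the question's clarification (only midweek and friday rows count, and "at least 60%" is inclusive), a variant might look like the sketch below. The table aliases e and s, the BETWEEN form of the window, and grouping by e.id in addition to the name (so that two employees sharing a name are not merged) are choices made for this illustration, not part of the original answer.

```sql
-- Sketch: arrival rate computed over midweek/friday rows only.
-- AVG over the 1/0 flags yields the fraction of in-window arrivals.
SELECT e.id,
       e.name,
       AVG(CASE WHEN s.arrival BETWEEN time '08:25' AND time '08:35'
                THEN 1 ELSE 0 END) AS arrival_rate
FROM employee e
INNER JOIN stats s ON s.employee_id = e.id
WHERE s.day IN ('midweek', 'friday')   -- ignore monday rows entirely
GROUP BY e.id, e.name
HAVING AVG(CASE WHEN s.arrival BETWEEN time '08:25' AND time '08:35'
                THEN 1 ELSE 0 END) >= 0.6   -- "at least 60%"
ORDER BY e.id;
```

On the sample data this should return John and Susan, each with a rate of 0.75.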