Question

I have an orders table.

+----------+--------+---------+
| order_id |  city  | status  |
+----------+--------+---------+
|        1 | NYC    | success |
|        2 | London | failure |
|        3 | Tokyo  | success |
|        4 | NYC    | failure |
|        5 | London | failure |
|        6 | Tokyo  | success |
|        7 | NYC    | success |
|        8 | London | failure |
|        9 | Tokyo  | success |
|       10 | NYC    | failure |
+----------+--------+---------+

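I want to write a query that lists all cities in ascending order of failure rate.

A city's failure rate = (failed orders in the city) * 100 / (total orders in the city)

For the table above, the query should output: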

+--------+--------------+
|  city  | failure_rate |
+--------+--------------+
| Tokyo  |            0 |
| NYC    |           50 |
| London |          100 |
+--------+--------------+

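Right now, as a newbie, I can only write queries that get the number of failed orders grouped by city and the total number of orders grouped by city, like this: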

select city, count(order_id) as failed
from orders
where status='failure'
group by city
order by failed asc;

and

select city, count(order_id) as total
from orders
group by city
order by total asc;

But I can't work out how to write a single query that produces the desired result.

Answer 1

dataset['gc'] = grouper['LastTime'].transform('count')

Answer 2

Using filter, we can compute several conditional counts within the same group by.

select c.*,(failed*100)/total failure_rate
  from (select city
              ,count(*) total
              ,count(order_id) filter (where status='failure') failed
          from orders
        group by city
       ) c
order by failure_rate
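
The filter clause is not available in every database (MySQL, for example, does not support it). As a hedged sketch of the same idea for such systems, the conditional count can be written with sum(case ...) instead. This assumes the orders table from the question; the 100.0 literal is my own choice, made so the division is not truncated on engines that use integer division.

```sql
-- Sketch only: assumes the orders(order_id, city, status) table from the question.
-- sum(case ...) counts the failed orders per city; count(*) counts all orders per city.
select city,
       sum(case when status = 'failure' then 1 else 0 end) * 100.0 / count(*) as failure_rate
from orders
group by city
order by failure_rate asc;
```

For the sample data this should rank Tokyo first, then NYC, then London. The exact numeric formatting of failure_rate depends on the database.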

Answer 3

I like to use avg() for this. In generic SQL:

select city, 
       avg(case when status = 'failure' then 1.0 else 0 end) as fail_rate
from orders
group by city
order by fail_rate asc;

Depending on the database, the calculation can be simplified further. For example, in MySQL:

       avg( status = 'failure' ) as fail_rate

In Postgres / Redshift:

       avg( (status = 'failure')::int ) as fail_rate
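
Note that the avg() expressions above return a fraction between 0 and 1, whereas the question defines the failure rate as a percentage. One possible adjustment, shown here as a sketch rather than as part of the original answer, is to average 100.0 and 0 instead of 1.0 and 0. The alias failure_rate is taken from the question's expected output.

```sql
-- Sketch only: assumes the orders(order_id, city, status) table from the question.
-- Averaging 100.0 for failures and 0 otherwise gives the failure percentage directly.
select city,
       avg(case when status = 'failure' then 100.0 else 0 end) as failure_rate
from orders
group by city
order by failure_rate asc;
```

Multiplying the original fail_rate by 100 would be equivalent.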

Calculate the failure rate for each city in an orders table
