How to reuse the result of a subquery in a SELECT statement

Time: 2019-02-26 18:47:19

Tags: sql postgresql

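I have been working through some data for a university course assignment, and I am looking for ways to optimize a query.

The dataset I am using is the UK national police stop and search data, and I am trying to find the correlation between ethnicity and the share of stop and searches.

I have a query that, for each combination of police force and ethnicity, finds the total number of searches, the percentage of that force's searches accounted for by that ethnicity, the national average percentage for that ethnicity, and the difference between the force's percentage and the national average (I know this is confusing).

Here is my current "working" query: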

SELECT c1.FORCE,
       c1.ETHNICITY,
       (SELECT COUNT(*) FROM CRIMES WHERE FORCE = c1.FORCE AND ETHNICITY = c1.ETHNICITY) AS num_searches,
       (ROUND(((SELECT COUNT(*) FROM CRIMES WHERE FORCE = c1.FORCE AND ETHNICITY = c1.ETHNICITY) /
           (SELECT COUNT(*) FROM CRIMES WHERE FORCE = c1.FORCE)::DECIMAL), 4) * 100) AS percentage_of_force,
       (SELECT ROUND((COUNT(*) / 303565::DECIMAL) * 100, 4) FROM CRIMES WHERE ETHNICITY = c1.ETHNICITY GROUP BY ETHNICITY) AS national_average,
       (SELECT (ROUND(((SELECT COUNT(*) FROM CRIMES WHERE FORCE = c1.FORCE AND ETHNICITY = c1.ETHNICITY) /
           (SELECT COUNT(*) FROM CRIMES WHERE FORCE = c1.FORCE)::DECIMAL), 4) * 100) - (SELECT ROUND((COUNT(*) / 303565::DECIMAL) * 100, 4) FROM CRIMES WHERE ETHNICITY = c1.ETHNICITY GROUP BY ETHNICITY)) AS difference_from_average
FROM (SELECT * FROM CRIMES) AS c1
GROUP BY c1.FORCE, c1.ETHNICITY
ORDER BY c1.FORCE, c1.ETHNICITY;

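So the problem I am trying to solve is the repeated use of the same subquery in the SELECT list.

As you can see from the query above, difference_from_average is simply percentage_of_force minus national_average, but I can't figure out a way to calculate these values once and then reuse them elsewhere in the SELECT list. So my question is: how can I do that?

Additional information

Sample input data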

| date       | ethnicity | force           |
|------------|-----------|-----------------|
| 2018-01-01 | White     | metropolitan    |
| 2018-01-01 | White     | west-yorkshire  |
| 2018-01-01 | White     | metropolitan    |
| 2018-01-01 | White     | metropolitan    |
| 2018-01-01 | White     | north-yorkshire |
| 2018-01-01 | White     | west-yorkshire  |
| 2018-01-01 | Black     | metropolitan    |
| 2018-01-01 | Undefined | metropolitan    |
| 2018-01-01 | White     | metropolitan    |
| 2018-01-01 | White     | metropolitan    |
| 2018-01-01 | White     | norfolk         |
| 2018-01-01 | White     | north-yorkshire |
| 2018-01-01 | White     | northumbria     |
| 2018-01-01 | White     | west-yorkshire  |
| 2018-01-01 | Black     | metropolitan    |
| 2018-01-01 | Black     | metropolitan    |
| 2018-01-01 | Black     | metropolitan    |
| 2018-01-01 | Black     | metropolitan    |
| 2018-01-01 | White     | metropolitan    |
| 2018-01-01 | Black     | metropolitan    |

Sample query results

| force             | ethnicity | num_searches | percentage_of_force | national_average | difference_from_average |
|-------------------|-----------|--------------|---------------------|------------------|-------------------------|
| avon-and-somerset | Asian     | 41           | 2.88                | 13.0641          | -10.1841                |
| avon-and-somerset | Black     | 223          | 15.64               | 25.6798          | -10.0398                |
| avon-and-somerset | Other     | 66           | 4.63                | 2.7368           | 1.8932                  |
| avon-and-somerset | Undefined | 184          | 12.9                | 7.4699           | 5.4301                  |
| avon-and-somerset | White     | 912          | 63.96               | 50.941           | 13.019                  |
| bedfordshire      | Asian     | 440          | 23.31               | 13.0641          | 10.2459                 |
| bedfordshire      | Black     | 373          | 19.76               | 25.6798          | -5.9198                 |
| bedfordshire      | Mixed     | 2            | 0.11                | 0.1084           | 0.0016                  |
| bedfordshire      | Other     | 33           | 1.75                | 2.7368           | -0.9868                 |
| bedfordshire      | Undefined | 97           | 5.14                | 7.4699           | -2.3299                 |
| bedfordshire      | White     | 943          | 49.95               | 50.941           | -0.991                  |
| btp               | Asian     | 301          | 7.14                | 13.0641          | -5.9241                 |
| btp               | Black     | 1274         | 30.23               | 25.6798          | 4.5502                  |
| btp               | Other     | 71           | 1.68                | 2.7368           | -1.0568                 |
| btp               | Undefined | 48           | 1.14                | 7.4699           | -6.3299                 |
| btp               | White     | 2521         | 59.81               | 50.941           | 8.869                   |

I am using PostgreSQL v11.2.

2 answers:

Answer 0 (score: 1)

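There are several ways to simplify the query. You could use a series of CTEs to precompute results at the different levels of aggregation. But I think the most efficient and most readable approach is to use window functions.

All the intermediate counts can be computed in a subquery using COUNT(...) OVER(...) with different PARTITION BY options, like so: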

SELECT
    force,
    ethnicity,
    COUNT(*) OVER(PARTITION BY force, ethnicity) AS cnt,
    COUNT(*) OVER(PARTITION BY force) AS cnt_force,
    COUNT(*) OVER(PARTITION BY ethnicity) AS cnt_ethnicity,
    ROW_NUMBER() OVER(PARTITION BY force, ethnicity) AS rn
FROM crimes

The outer query can then compute the final results (while filtering on the first record of each force / ethnicity tuple to avoid duplicates).

Query:

SELECT 
    force,
    ethnicity,
    cnt AS num_searches,
    ROUND(cnt / cnt_force::decimal * 100, 4) AS percentage_of_force,
    ROUND(cnt_ethnicity / 303565::decimal * 100, 4) AS national_average,
    ROUND(cnt / cnt_force::decimal * 100, 4) 
        - ROUND(cnt_ethnicity / 303565::decimal * 100, 4) AS difference_from_average
FROM (
    SELECT
        force,
        ethnicity,
        COUNT(*) OVER(PARTITION BY force, ethnicity) AS cnt,
        COUNT(*) OVER(PARTITION BY force) AS cnt_force,
        COUNT(*) OVER(PARTITION BY ethnicity) AS cnt_ethnicity,
        ROW_NUMBER() OVER(PARTITION BY force, ethnicity) AS rn
    FROM crimes
    ) x
WHERE rn = 1
ORDER BY force, ethnicity;

Demo on DB Fiddle

| force           | ethnicity | num_searches | percentage_of_force | national_average | difference_from_average |
| --------------- | --------- | ------------ | ------------------- | ---------------- | ----------------------- |
| metropolitan    | Black     | 6            | 46.1538             | 0.0020           | 46.1518                 |
| metropolitan    | Undefined | 1            | 7.6923              | 0.0003           | 7.6920                  |
| metropolitan    | White     | 6            | 46.1538             | 0.0043           | 46.1495                 |
| norfolk         | White     | 1            | 100.0000            | 0.0043           | 99.9957                 |
| north-yorkshire | White     | 2            | 100.0000            | 0.0043           | 99.9957                 |
| northumbria     | White     | 1            | 100.0000            | 0.0043           | 99.9957                 |
| west-yorkshire  | White     | 3            | 100.0000            | 0.0043           | 99.9957                 |
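
For comparison, the CTE alternative mentioned at the start of this answer could look roughly like the sketch below. It is only an illustration, not a tested replacement: it assumes the crimes table has the force and ethnicity columns shown in the question, and it assumes that the hard-coded 303565 is the total number of rows in crimes, so it derives that total with SUM(...) OVER () instead. The CTE names (per_group, with_totals, pct) are made up for this example.

```sql
-- Sketch under assumptions: crimes(force, ethnicity) as in the question,
-- and 303565 in the original query being the total row count of crimes.
WITH per_group AS (
    -- One row per (force, ethnicity) pair with its number of searches.
    SELECT force, ethnicity, COUNT(*) AS num_searches
    FROM crimes
    GROUP BY force, ethnicity
),
with_totals AS (
    -- Window sums over the already-aggregated rows.
    -- SUM over a bigint returns numeric, so the divisions below are not
    -- integer divisions.
    SELECT
        force,
        ethnicity,
        num_searches,
        SUM(num_searches) OVER (PARTITION BY force)     AS force_total,
        SUM(num_searches) OVER (PARTITION BY ethnicity) AS ethnicity_total,
        SUM(num_searches) OVER ()                       AS grand_total
    FROM per_group
),
pct AS (
    -- Each percentage is computed exactly once here.
    SELECT
        force,
        ethnicity,
        num_searches,
        ROUND(num_searches / force_total * 100, 4)    AS percentage_of_force,
        ROUND(ethnicity_total / grand_total * 100, 4) AS national_average
    FROM with_totals
)
SELECT
    force,
    ethnicity,
    num_searches,
    percentage_of_force,
    national_average,
    -- Reuses the two columns computed in the pct CTE.
    percentage_of_force - national_average AS difference_from_average
FROM pct
ORDER BY force, ethnicity;
```

Because the rows are grouped first, no ROW_NUMBER() filter is needed to remove duplicates. Note that on the small sample data the derived total would be the number of sample rows, so national_average would differ from the table above, which divides by 303565.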

Answer 1 (score: 0)

The trick is to use a subselect:

SELECT f(a, b), a, c
FROM (SELECT g(c, d) AS a,
             h(c) AS b, 
             c, d
      FROM x) AS q;

You get the idea.
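
Applied to the query from the question, that pattern might look like the following sketch. It assumes the crimes table and the hard-coded total of 303565 from the question, and it rounds after multiplying by 100 rather than before, so the number of decimal places differs slightly from the original query. The alias q is arbitrary.

```sql
-- Sketch under assumptions: crimes(force, ethnicity) and the constant
-- 303565 are taken from the question as-is.
SELECT
    force,
    ethnicity,
    num_searches,
    percentage_of_force,
    national_average,
    -- The outer query can refer to the inner query's column aliases.
    percentage_of_force - national_average AS difference_from_average
FROM (
    SELECT
        force,
        ethnicity,
        COUNT(*) AS num_searches,
        -- SUM(COUNT(*)) OVER (...) is a window sum over the grouped counts;
        -- it returns numeric, so this is not an integer division.
        ROUND(COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY force) * 100, 4)
            AS percentage_of_force,
        ROUND(SUM(COUNT(*)) OVER (PARTITION BY ethnicity) / 303565::decimal * 100, 4)
            AS national_average
    FROM crimes
    GROUP BY force, ethnicity
) AS q
ORDER BY force, ethnicity;
```

The inner query computes each percentage once and names it; the outer query then uses those names, which is not possible within a single SELECT list because a column alias cannot be referenced by another expression at the same level.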