Is it possible to find employees whose salary is above average without using a subquery or a join?

Time: 2018-03-09 06:55:49

Tags: sql

This is my query to find the employees whose salary is above average. I use a subquery:

SELECT salary 
FROM Employee 
WHERE salary > (SELECT AVG(salary) FROM employee)

Is it possible to find these employees without using a subquery or a join?

1 answer:

Answer 0: (score: 2)

If it is acceptable to show only the highest above-average salary (rather than all salaries above the average), then this can be done without a sub-select:

select salary, 
       salary - avg(salary) over () as diff_to_average,
       avg(salary) over () as average_salary 
from employees
order by 2 desc
fetch first 1 row only;

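(The above is standard ANSI SQL.)

The drawback is that you can't remove the diff_to_average column, because you can't use a column alias in the where clause on the same level (you could remove average_salary, though). However, the whole question doesn't really make sense.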
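To make the behaviour of that first query concrete, here is a minimal sketch that runs it against a small in-memory SQLite database from Python. The sample rows are made up for illustration, and `LIMIT 1` stands in for `fetch first 1 row only`, which SQLite does not accept. It assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample data; the table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("a", 1000), ("b", 2000), ("c", 3000), ("d", 6000), ("e", 8000)],
)

# Same shape as the query above; LIMIT 1 replaces FETCH FIRST 1 ROW ONLY for SQLite.
top_row = conn.execute(
    """
    SELECT salary,
           salary - AVG(salary) OVER () AS diff_to_average,
           AVG(salary) OVER () AS average_salary
    FROM employees
    ORDER BY 2 DESC
    LIMIT 1
    """
).fetchone()

print(top_row)  # (salary, diff_to_average, average_salary) for the highest salary
```

Only one row comes back, so this variant answers "what is the highest salary and how far above the average is it", not "who earns more than the average".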

A solution that does not use a sub-select, but only a derived table, would be:

select *
from (
  select salary, avg(salary) over () as average_salary
  from employees
) t
where salary > average_salary
order by salary;

The derived table is only necessary because SQL does not allow (re-)using a column alias in the WHERE clause on the same level.
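As a quick sanity check, the derived-table query and the subquery from the question should return the same rows. The sketch below compares them on a small in-memory SQLite database; the sample salaries are invented, and it assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample data; the average salary of these five rows is 4000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("a", 1000), ("b", 2000), ("c", 3000), ("d", 6000), ("e", 8000)],
)

# The query from the question (scalar subquery in the WHERE clause).
subquery_sql = """
    SELECT salary
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY salary
"""

# The derived-table variant: the window function computes the average once per row,
# and the outer query filters on the aliased column.
derived_table_sql = """
    SELECT salary
    FROM (
        SELECT salary, AVG(salary) OVER () AS average_salary
        FROM employees
    ) t
    WHERE salary > average_salary
    ORDER BY salary
"""

subquery_rows = [row[0] for row in conn.execute(subquery_sql)]
derived_rows = [row[0] for row in conn.execute(derived_table_sql)]
print(subquery_rows, derived_rows)
```

Both lists contain the salaries strictly greater than the average, so the two formulations differ only in how the database evaluates them.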

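However, depending on the DBMS, the query in your question is probably more efficient, because the window function in the derived table typically requires some kind of buffering, which does not happen when using the sub-select from your question.

I created a table with three columns (id, name and salary) and one million rows, then compared the two queries. I did not create an index on the salary column.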
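A comparison of this kind could be set up roughly as follows. This is not the setup used for the results below (those come from Postgres and Oracle); it is only a sketch that uses an in-memory SQLite database, randomly generated salaries, and a reduced row count, so any timings it prints say nothing about other database systems. It assumes an SQLite build with window-function support (3.25 or later).

```python
import random
import sqlite3
import time

ROW_COUNT = 100_000  # the comparison above used one million rows; reduced to keep this quick

random.seed(0)  # reproducible, made-up salaries
conn = sqlite3.connect(":memory:")
# Three columns as described above, and no index on salary.
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp (id, name, salary) VALUES (?, ?, ?)",
    ((i, f"name_{i}", random.randint(1_000, 10_000)) for i in range(1, ROW_COUNT + 1)),
)

QUERIES = {
    "subquery": """
        SELECT * FROM emp
        WHERE salary > (SELECT AVG(salary) FROM emp)
    """,
    "window": """
        SELECT * FROM (
            SELECT salary, AVG(salary) OVER () AS average_salary
            FROM emp
        ) t
        WHERE salary > average_salary
        ORDER BY salary
    """,
}


def run_timed(sql):
    """Run a query, fetch all rows, and return (row count, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return len(rows), time.perf_counter() - start


results = {label: run_timed(sql) for label, sql in QUERIES.items()}
for label, (row_count, seconds) in results.items():
    print(f"{label}: {row_count} rows in {seconds:.3f} s")
```

Wall-clock timing like this is much coarser than an execution plan; in Postgres, `EXPLAIN (ANALYZE, BUFFERS)` produces the kind of output shown below.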

Postgres 10 results:

The query using the window function buffers the result in order to evaluate it:

Sort  (cost=50423.64..51256.98 rows=333333 width=73) (actual time=598.267..608.075 rows=500409 loops=1)
  Sort Key: t.salary
  Sort Method: quicksort  Memory: 82659kB
  Buffers: shared hit=9346
  ->  Subquery Scan on t  (cost=0.00..19846.00 rows=333333 width=73) (actual time=218.982..454.620 rows=500409 loops=1)
        Filter: ((t.salary)::numeric > t.average_salary)
        Rows Removed by Filter: 499591
        Buffers: shared hit=9346
        ->  WindowAgg  (cost=0.00..13846.00 rows=1000000 width=73) (actual time=218.978..336.965 rows=1000000 loops=1)
              Buffers: shared hit=9346
              ->  Seq Scan on emp  (cost=0.00..10346.00 rows=1000000 width=41) (actual time=0.022..55.422 rows=1000000 loops=1)
                    Buffers: shared hit=9346
Planning time: 0.099 ms
Execution time: 671.334 ms

The solution from the question, which uses the subquery, is more efficient because it does not need any intermediate memory:

Seq Scan on emp  (cost=12846.00..28192.00 rows=333333 width=41) (actual time=122.729..301.144 rows=500409 loops=1)
  Filter: ((salary)::numeric > $0)
  Rows Removed by Filter: 499591
  Buffers: shared hit=18692
  InitPlan 1 (returns $0)
    ->  Aggregate  (cost=12846.00..12846.00 rows=1 width=32) (actual time=122.715..122.715 rows=1 loops=1)
          Buffers: shared hit=9346
          ->  Seq Scan on emp emp_1  (cost=0.00..10346.00 rows=1000000 width=4) (actual time=0.004..54.477 rows=1000000 loops=1)
                Buffers: shared hit=9346
Planning time: 0.062 ms
Execution time: 309.586 ms

Oracle 12.1 results:

The Oracle execution plan looks quite similar; Oracle also buffers the result in the case of the window function:

SQL_ID  2x0xhkm1pkamz, child number 0
-------------------------------------
select * from (   select salary, avg(salary) over () as average_salary  
 from emp ) t where salary > average_salary order by salary

Plan hash value: 1471144246

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows |E-Bytes|E-Temp | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |       |       |  6660 (100)|    500K|00:00:01.02 |    6679 |       |       |          |
|   1 |  SORT ORDER BY       |      |      1 |    655K|    16M|    22M|  6660   (1)|    500K|00:00:01.02 |    6679 |    17M|  1562K|   15M (0)|
|*  2 |   VIEW               |      |      1 |    655K|    16M|       |  1812   (1)|    500K|00:00:00.79 |    6679 |       |       |          |
|   3 |    WINDOW BUFFER     |      |      1 |    655K|  8325K|       |  1812   (1)|   1000K|00:00:00.65 |    6679 |    34M|  2096K|   30M (0)|
|   4 |     TABLE ACCESS FULL| EMP  |      1 |    655K|  8325K|       |  1812   (1)|   1000K|00:00:00.09 |    6679 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$2 / T@SEL$1
   3 - SEL$2
   4 - SEL$2 / EMP@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("SALARY">"AVERAGE_SALARY")

As with Postgres, the query using the sub-select is also more efficient in Oracle:

SQL_ID  6fmzs2ru2cxa5, child number 1
-------------------------------------
select * from emp  where salary > (select avg(salary) from emp)

Plan hash value: 1876299339

-----------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |       |  1814 (100)|    500K|00:00:00.27 |   14347 |
|*  1 |  TABLE ACCESS FULL  | EMP  |      1 |    500K|    37M|     2   (0)|    500K|00:00:00.27 |   14347 |
|   2 |   SORT AGGREGATE    |      |      1 |      1 |    13 |            |      1 |00:00:00.18 |    6679 |
|   3 |    TABLE ACCESS FULL| EMP  |      1 |    655K|  8325K|  1812   (1)|   1000K|00:00:00.09 |    6679 |
-----------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / EMP@SEL$1
   2 - SEL$2
   3 - SEL$2 / EMP@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SALARY">)

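So if your question is "I am looking for a more efficient query", then the answer is (at least for the two databases above): your query is as efficient as it gets.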