SQL aggregate function alias

Time: 2016-11-10 21:01:49

Tags: sql postgresql case aggregate-functions having

I'm a beginner in SQL, and this is the problem I've been asked to solve:

  

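Suppose that a big city is defined as a place of type city with a population of at least 100,000. Write an SQL query that returns the scheme (state_name,no_big_city,big_city_population) ordered by state_name, listing those states which have either (a) at least five big cities or (b) at least one million people living in big cities. The column state_name is the name of the state, no_big_city is the number of big cities in the state, and big_city_population is the number of people living in big cities in the state.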

Now, as far as I can tell, the following query returns the correct results:

[query not preserved in this copy]

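However, each of the two aggregate functions used in the code appears twice. My question: is there a way to make this code duplication disappear while preserving the functionality?

To be clear, I have already tried using aliases, but I just get a "column does not exist" error.

3 answers:

Answer 0: (score: 4)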

The manual clarifies:

  

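An output column's name can be used to refer to the column's value in ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses; there you must write out the expression instead.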

The emphasis on "but not in the WHERE or HAVING clauses" is mine.

You can avoid typing long expressions repeatedly by using a subquery or a CTE:

SELECT state_name, no_big_city, big_city_population
FROM  (
   SELECT s.name AS state_name
        , COUNT(*)        FILTER (WHERE p.type = 'city' AND p.population >= 100000) AS no_big_city
        , SUM(population) FILTER (WHERE p.type = 'city' AND p.population >= 100000) AS big_city_population
   FROM   state s
   JOIN   place p ON s.code = p.state_code
   GROUP  BY s.name -- can be input column name as well, best schema-qualified to avoid ambiguity
   ) sub
WHERE  no_big_city >= 5
   OR  big_city_population >= 1000000
ORDER  BY state_name;

While I was at it, I simplified the aggregates with the FILTER clause (Postgres 9.4+).

However, I suggest this simpler and faster query to begin with:

SELECT s.name AS state_name, p.no_big_city, p.big_city_population
FROM   state s
JOIN  (
   SELECT state_code      AS code  -- alias just to simplify join
        , count(*)        AS no_big_city
        , sum(population) AS big_city_population
   FROM   place
   WHERE  type = 'city'
   AND    population >= 100000
   GROUP  BY 1  -- can be ordinal number referencing position in SELECT list
   HAVING count(*) >= 5 OR sum(population) >= 1000000  -- simple expressions now
   ) p USING (code)
ORDER  BY 1;    -- can also be ordinal number

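I am demonstrating another option for referencing expressions in GROUP BY and ORDER BY: ordinal numbers that point at a position in the SELECT list. Use it only where it does not impair readability and maintainability.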
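To try the second query without a PostgreSQL server, here is a minimal sketch that runs it from Python against an in-memory SQLite database. Using SQLite as a stand-in for PostgreSQL, the table layout state(code, name) and place(state_code, type, population), and all sample rows are assumptions made up for illustration. The query follows the one above and selects s.name, since the question says the state's name column is called name.

```python
import sqlite3

# Hypothetical schema and rows, inferred from the question; not real data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE state (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE place (state_code TEXT, type TEXT, population INTEGER);
    INSERT INTO state VALUES ('AA', 'Alpha'), ('BB', 'Beta'), ('CC', 'Gamma');
    INSERT INTO place VALUES
        ('AA', 'city', 100000), ('AA', 'city', 120000), ('AA', 'city', 150000),
        ('AA', 'city', 110000), ('AA', 'city', 130000),
        ('BB', 'city', 1200000), ('BB', 'city', 50000), ('BB', 'town', 500000),
        ('CC', 'city', 200000), ('CC', 'city', 300000);
""")

# Aggregate place first, filter groups with HAVING, then join to state.
QUERY = """
    SELECT s.name AS state_name, p.no_big_city, p.big_city_population
    FROM   state s
    JOIN  (
       SELECT state_code      AS code
            , count(*)        AS no_big_city
            , sum(population) AS big_city_population
       FROM   place
       WHERE  type = 'city'
       AND    population >= 100000
       GROUP  BY 1   -- ordinal: first item of the SELECT list
       HAVING count(*) >= 5 OR sum(population) >= 1000000
       ) p USING (code)
    ORDER  BY 1      -- ordinal: state_name
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this sample data, Alpha qualifies through its five big cities and Beta through a single city of 1,200,000 people, while Gamma (two big cities, 500,000 people) is filtered out.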

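Answer 1: (score: 1)

I'm not sure whether this is a comment or an answer, since it is more about technique, but I'll post it anyway.

What I usually do when I need to reference calculated columns (often a LOT of them at once) is put the calculated columns in a derived table and then reference them by their aliases outside the derived table. This syntax should be valid ANSI SQL, but I'm not familiar with PostgreSQL.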

select * from (

SELECT STATE.NAME AS state_name
    ,COUNT(CASE WHEN place.type = 'city'
                AND place.population >= 100000 THEN 1 ELSE NULL END) AS no_big_city
    ,SUM(CASE WHEN place.type = 'city'
                AND place.population >= 100000 THEN place.population ELSE NULL END) AS big_city_population
FROM STATE
INNER JOIN place
    ON STATE.code = place.state_code
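    GROUP BY STATE.NAME  -- group by the input column; aliases in GROUP BY are not portable
) sub 
    where no_big_city >= 5 
        or big_city_population >= 1000000  -- OR and one million, per the problem statement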

--HAVING COUNT(CASE WHEN place.type = 'city'
--          AND place.population >= 100000 THEN 1 ELSE NULL END) >= 5
--  OR SUM(CASE WHEN place.type = 'city'
--              AND place.population >= 100000 THEN place.population ELSE NULL END) >= 1000000
ORDER BY state_name;

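The benefit of this approach is that, although the subquery/derived table adds some complexity, each formula lives in one place, so any change only has to be made once. I don't know how it performs compared with simply repeating the calculations, but I can't imagine it being any worse.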
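As a rough check that moving the formulas into a derived table does not change the result, the sketch below runs both forms side by side from Python on an in-memory SQLite database. SQLite standing in for PostgreSQL, the schema, and the sample rows are assumptions for illustration only. The repeated-HAVING form is modeled on the commented-out HAVING lines above.

```python
import sqlite3

# Hypothetical schema and rows, inferred from the question; not real data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE state (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE place (state_code TEXT, type TEXT, population INTEGER);
    INSERT INTO state VALUES ('AA', 'Alpha'), ('BB', 'Beta'), ('CC', 'Gamma');
    INSERT INTO place VALUES
        ('AA', 'city', 100000), ('AA', 'city', 120000), ('AA', 'city', 150000),
        ('AA', 'city', 110000), ('AA', 'city', 130000),
        ('BB', 'city', 1200000), ('BB', 'city', 50000), ('BB', 'town', 500000),
        ('CC', 'city', 200000), ('CC', 'city', 300000);
""")

# Form 1: every aggregate expression is written twice (SELECT and HAVING).
REPEATED = """
    SELECT state.name AS state_name,
           COUNT(CASE WHEN place.type = 'city' AND place.population >= 100000
                      THEN 1 END) AS no_big_city,
           SUM(CASE WHEN place.type = 'city' AND place.population >= 100000
                    THEN place.population END) AS big_city_population
    FROM state
    JOIN place ON state.code = place.state_code
    GROUP BY state.name
    HAVING COUNT(CASE WHEN place.type = 'city' AND place.population >= 100000
                      THEN 1 END) >= 5
        OR SUM(CASE WHEN place.type = 'city' AND place.population >= 100000
                    THEN place.population END) >= 1000000
    ORDER BY state_name
"""

# Form 2: each expression is written once; the outer WHERE uses the aliases.
DERIVED = """
    SELECT * FROM (
        SELECT state.name AS state_name,
               COUNT(CASE WHEN place.type = 'city' AND place.population >= 100000
                          THEN 1 END) AS no_big_city,
               SUM(CASE WHEN place.type = 'city' AND place.population >= 100000
                        THEN place.population END) AS big_city_population
        FROM state
        JOIN place ON state.code = place.state_code
        GROUP BY state.name
    ) sub
    WHERE no_big_city >= 5
       OR big_city_population >= 1000000
    ORDER BY state_name
"""

repeated_rows = conn.execute(REPEATED).fetchall()
derived_rows = conn.execute(DERIVED).fetchall()
print(repeated_rows == derived_rows)  # both forms select the same states
print(derived_rows)
```

If the population threshold ever changes, the derived-table form needs the edit in two places (one per aggregate), while the repeated form needs it in four.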

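Answer 2: (score: 0)

The SELECT clause lists what you want to select from the tables, after the WHERE clause has filtered the rows. GROUP BY determines how the filtered rows are grouped for the aggregate functions in SELECT. The aliases therefore do not exist yet when WHERE and HAVING are evaluated. But you can wrap the filtered and grouped records in a subquery and select from that. Something like this: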

SELECT state_name, no_big_city, big_city_population
FROM
 (
   SELECT
     state.name AS state_name,
     COUNT(*) AS no_big_city,
     SUM(place.population) AS big_city_population
   FROM state JOIN place ON state.code = place.state_code
   WHERE
     place.type = 'city' AND
     place.population >= 100000
   GROUP BY state.name
  ) sub  -- older PostgreSQL versions require an alias for a subquery in FROM
WHERE
   no_big_city >= 5 OR
   big_city_population >= 1000000  -- total big-city population, per the problem statement
ORDER BY state_name;

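In addition, moving the conditions

   place.type = 'city' AND
   place.population >= 100000

from the CASE expressions to the WHERE clause will perform better: rows for non-cities and small cities are never processed, especially if there is an index on the place.type column.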
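One side effect of that move is worth seeing on data: with CASE, the grouped result still has a row for every state (possibly with a count of 0), while with WHERE, states that have no big city never reach the grouping step. The final answer is the same because such states fail the outer filter anyway. The following is a minimal sketch in Python on an in-memory SQLite database; SQLite as a stand-in for PostgreSQL, the schema, the sample rows, and the helper names grouped and filtered are all assumptions for illustration.

```python
import sqlite3

# Hypothetical schema and rows; Delta has no city at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE state (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE place (state_code TEXT, type TEXT, population INTEGER);
    INSERT INTO state VALUES
        ('AA', 'Alpha'), ('BB', 'Beta'), ('CC', 'Gamma'), ('DD', 'Delta');
    INSERT INTO place VALUES
        ('AA', 'city', 100000), ('AA', 'city', 120000), ('AA', 'city', 150000),
        ('AA', 'city', 110000), ('AA', 'city', 130000),
        ('BB', 'city', 1200000), ('BB', 'city', 50000), ('BB', 'town', 500000),
        ('CC', 'city', 200000), ('CC', 'city', 300000),
        ('DD', 'town', 80000);
""")

# Conditions inside the aggregates: every joined row is grouped.
CASE_INNER = """
    SELECT state.name AS state_name,
           COUNT(CASE WHEN place.type = 'city' AND place.population >= 100000
                      THEN 1 END) AS no_big_city,
           SUM(CASE WHEN place.type = 'city' AND place.population >= 100000
                    THEN place.population END) AS big_city_population
    FROM state JOIN place ON state.code = place.state_code
    GROUP BY state.name
"""

# Conditions in WHERE: only big-city rows are grouped.
WHERE_INNER = """
    SELECT state.name AS state_name,
           COUNT(*) AS no_big_city,
           SUM(place.population) AS big_city_population
    FROM state JOIN place ON state.code = place.state_code
    WHERE place.type = 'city' AND place.population >= 100000
    GROUP BY state.name
"""

def grouped(inner):
    """Return the grouped rows before the final filter is applied."""
    return conn.execute(
        f"SELECT * FROM ({inner}) sub ORDER BY state_name"
    ).fetchall()

def filtered(inner):
    """Return the grouped rows that pass the problem's final filter."""
    return conn.execute(
        f"SELECT * FROM ({inner}) sub "
        "WHERE no_big_city >= 5 OR big_city_population >= 1000000 "
        "ORDER BY state_name"
    ).fetchall()

case_inner_rows = grouped(CASE_INNER)    # includes Delta with a count of 0
where_inner_rows = grouped(WHERE_INNER)  # Delta never appears
case_final = filtered(CASE_INNER)
where_final = filtered(WHERE_INNER)
print(len(case_inner_rows), len(where_inner_rows))
print(case_final == where_final)
```

The two forms would only differ if the task also required listing states without any big city; in that case the CASE form (or an outer join) would be needed.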