Question

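I have a table with the columns id, year and count.

I want to get the MAX(count) for each id and keep the year in which it occurs, so I wrote this query:

SELECT id, year, MAX(count)
FROM table
GROUP BY id;

Unfortunately, it gives me an error:

ERROR: column "table.year" must appear in the GROUP BY clause or be used in an aggregate function

So I tried:

SELECT id, year, MAX(count)
FROM table
GROUP BY id, year;

But then it doesn't compute MAX(count) per id; it just returns the table as it is. I suppose that is because, when grouping by year and id, it takes the maximum for the id within that specific year.

So, how can I write that query? I want to get each id's MAX(count) and the year in which it occurs.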
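
To make the problem concrete, here is a minimal sketch with made-up data. The table name tbl and all values are assumptions for illustration only; they are not from the original post.

```sql
-- Hypothetical sample data (table name and values are assumptions).
CREATE TEMP TABLE tbl (id int, year int, count int);
INSERT INTO tbl VALUES
  (1, 2010, 5),
  (1, 2011, 9),
  (1, 2012, 9),
  (2, 2010, 7),
  (2, 2011, 3);

-- Every (id, year) pair is unique here, so each row forms its own group
-- and MAX(count) is just that row's count: all five rows come back.
SELECT id, year, MAX(count)
FROM   tbl
GROUP  BY id, year;

-- Desired result instead: one row per id, carrying its largest count
-- and a year in which that count occurred:
--   id 1 -> count 9 (reached in both 2011 and 2012, so a tie rule is needed)
--   id 2 -> count 7 in 2010
```

The tie for id 1 shows why the answers differ in detail: a query has to either pick one of the tied years or return both rows.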

Answer 1

The shortest (and possibly fastest) query uses DISTINCT ON, a PostgreSQL extension of the SQL standard DISTINCT clause:

SELECT DISTINCT ON (1)
       id, count, year
FROM   tbl
ORDER  BY 1, 2 DESC, 3;

The numbers refer to ordinal positions in the SELECT list. For clarity, you can spell out the column names instead:

SELECT DISTINCT ON (id)
       id, count, year
FROM   tbl
ORDER  BY id, count DESC, year;

The result is ordered by id, which may or may not be welcome. In any case, it is better than "undefined".

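It also breaks ties (when multiple years share the same maximum count) in a well-defined way: it picks the earliest year. If you don't care, drop year from the ORDER BY. Or pick the latest year with year DESC.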
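
The two tie-breaking variants just described could look like this. It is a sketch that assumes the same tbl with columns id, count and year as in the queries above.

```sql
-- Ties on count are resolved in favour of the most recent year.
SELECT DISTINCT ON (id)
       id, count, year
FROM   tbl
ORDER  BY id, count DESC, year DESC;

-- No preference among tied years: year is dropped from ORDER BY,
-- so an arbitrary one of the tied rows is returned for each id.
SELECT DISTINCT ON (id)
       id, count, year
FROM   tbl
ORDER  BY id, count DESC;
```

One caveat when adapting this: in PostgreSQL, DESC sorts NULL values first by default, so if count can be NULL you would probably want count DESC NULLS LAST in the ORDER BY.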

This closely related answer has more explanation, links, a benchmark and possibly faster solutions:

Select first row in each GROUP BY group?

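Aside: in a real-life query you wouldn't use some of these column names. id is a non-descriptive anti-pattern for a column name, and count is a reserved word in standard SQL and an aggregate function in Postgres.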

Answer 2

select *
from (
  select id, 
         year,
         thing,
         max(thing) over (partition by id) as max_thing
  from the_table
) t
where thing = max_thing

Or:

select t1.id,
       t1.year,
       t1.thing
from the_table t1
where t1.thing = (select max(t2.thing) 
                  from the_table t2
                  where t2.id = t1.id);

Or:

select t1.id,
       t1.year,
       t1.thing
from the_table t1
  join ( 
    select id, max(t2.thing) as max_thing
    from the_table t2
    group by id
  ) t on t.id = t1.id and t.max_thing = t1.thing

Or (the same as the previous one, in different notation):

with max_stuff as (
  select id, max(t2.thing) as max_thing
  from the_table t2
  group by id
) 
select t1.id, 
       t1.year,
       t1.thing
from the_table t1
  join max_stuff t2 
    on t1.id = t2.id 
   and t1.thing = t2.max_thing
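
Note that all of the variants above return every row that ties for the maximum, so an id can appear more than once. If exactly one row per id is wanted, a window-function variant with row_number() is one option. This is a sketch and not part of the original answer; it assumes the same the_table with columns id, year and thing, and it arbitrarily chooses the earliest year among ties.

```sql
-- Rank the rows of each id by thing (largest first), breaking ties
-- by the earliest year, then keep only the top-ranked row per id.
select id, year, thing
from (
  select id,
         year,
         thing,
         row_number() over (partition by id
                            order by thing desc, year) as rn
  from the_table
) t
where rn = 1;
```

Changing year to year desc in the window's order by would prefer the most recent year instead.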

PostgreSQL MAX and GROUP BY
