Question

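The goal is to run a query using two different tables: country and city. country contains name (of the country) and country_code (primary key); city contains name (of the city), population, and country_code (primary key). I want to use GROUP BY with an aggregate function, but the query I have below doesn't work.

For each country, list the largest population of any of its cities and the name of that city. In other words, I need to list the most populous city of each country.

So what should be displayed is the country, the city (with the largest population), and then the population of that city. There should be only one city per country.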

$query6 = "SELECT c.name AS country, ci.name AS city,
GREATEST(ci.population) AS max_pop
FROM lab6.country c INNER JOIN lab6.city ci
ON(c.country_code = ci.country_code)
GROUP BY c.name
ORDER BY country ASC";

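I have also tried GROUP BY country and DISTINCT c.name.

I am new to aggregate functions, so if there are specific cases where you would want to use GROUP BY and this isn't one of them, please let me know.

I am using PHP to run the query: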

$result = pg_query($connection, $query);
if(!$result)
{
       die("Failed to connect to database");
}

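The error is: ERROR: column "ci.name" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: SELECT DISTINCT c.name AS country, ci.name AS city,

The tables were given to us; we didn't create them, and I can't include a screenshot of their definitions because I don't have enough reputation.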

Answer 1

Some DDL to work with.

create table country (
  country_code char(2) primary key, -- ISO country code
  country_name varchar(35) not null unique
);

insert into country values 
('US', 'United States of America'),
('IT', 'Italy'),
('IN', 'India');

-- The full name of a city is more than city name plus country name.
-- In the US, there are a couple of dozen cities named Springfield,
-- each in a different state. I'd be surprised if this weren't true
-- in most countries.
create table city (
  country_code char(2) not null references country (country_code),
  name varchar(35) not null,
  population integer not null check (population > 0),
  primary key (country_code, name)
);

insert into city values 
('US', 'Rome, GA', 36303),
('US', 'Washington, DC', 632323),
('US', 'Springfield, VA', 30484),
('IT', 'Rome', 277979),
('IT', 'Milan', 1324110),
('IT', 'Bari', 320475),
('IN', 'Mumbai', 12478447),
('IN', 'Patna', 1683200),
('IN', 'Cuttack', 606007);

The largest city population in each country.

select country.country_code, max(city.population) as max_population
from country
inner join city on country.country_code = city.country_code
group by country.country_code;

There are several ways to use that to get the result you want. One way is to use an inner join on a common table expression.

with max_population as (
  select country.country_code, max(city.population) as max_population
  from country
  inner join city on country.country_code = city.country_code
  group by country.country_code
)
select city.country_code, city.name, city.population
from city
inner join max_population 
        on max_population.country_code = city.country_code
       and max_population.max_population = city.population;

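Another way is to use an inner join on a subquery. (The text of the common table expression "moves into" the main query. With the alias "max_population", the query works without further changes.)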

select city.country_code, city.name, city.population
from city
inner join (select country.country_code, max(city.population) as max_population
            from country
            inner join city on country.country_code = city.country_code
            group by country.country_code
           ) max_population 
        on max_population.country_code = city.country_code
       and max_population.max_population = city.population;
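As a quick sanity check of the two forms above, here is a minimal sketch that runs both against the sample rows. It uses Python's built-in sqlite3 module with an in-memory database purely so that it runs without a server; that choice is an assumption of this sketch, not part of the original setup. The two SELECT statements are the ones shown above, unchanged. SQLite is more lenient than PostgreSQL about the DDL, so this only checks the query results.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption made so the
# sketch is self-contained); the two queries are the ones shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table country (
  country_code char(2) primary key,
  country_name varchar(35) not null unique
);
create table city (
  country_code char(2) not null references country (country_code),
  name varchar(35) not null,
  population integer not null check (population > 0),
  primary key (country_code, name)
);
insert into country values
('US', 'United States of America'), ('IT', 'Italy'), ('IN', 'India');
insert into city values
('US', 'Rome, GA', 36303), ('US', 'Washington, DC', 632323),
('US', 'Springfield, VA', 30484),
('IT', 'Rome', 277979), ('IT', 'Milan', 1324110), ('IT', 'Bari', 320475),
('IN', 'Mumbai', 12478447), ('IN', 'Patna', 1683200), ('IN', 'Cuttack', 606007);
""")

cte_query = """
with max_population as (
  select country.country_code, max(city.population) as max_population
  from country
  inner join city on country.country_code = city.country_code
  group by country.country_code
)
select city.country_code, city.name, city.population
from city
inner join max_population
        on max_population.country_code = city.country_code
       and max_population.max_population = city.population
"""

subquery_query = """
select city.country_code, city.name, city.population
from city
inner join (select country.country_code, max(city.population) as max_population
            from country
            inner join city on country.country_code = city.country_code
            group by country.country_code
           ) max_population
        on max_population.country_code = city.country_code
       and max_population.max_population = city.population
"""

# Sort in Python because neither query has an ORDER BY.
cte_rows = sorted(conn.execute(cte_query).fetchall())
subquery_rows = sorted(conn.execute(subquery_query).fetchall())

for row in cte_rows:
    print(row)
```

Both queries return one row per country: Mumbai for IN, Milan for IT, and Washington, DC for US.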

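Yet another way is to use a window function in a subquery. You need to select from a subquery because you can't use the result of rank() directly in a WHERE clause. That is, this works.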

select country_code, name, population
from (select country_code, name, population,
      rank() over (partition by country_code 
                   order by population desc) as city_population_rank
      from city
     ) city_population_rankings
where city_population_rank = 1;

But this doesn't, even though it seems to make more sense at first glance.

select country_code, name, population,
       rank() over (partition by country_code 
                    order by population desc) as city_population_rank
from city
where city_population_rank = 1;

ERROR:  column "city_population_rank" does not exist

Answer 2

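The best way to do this in recent versions of PostgreSQL is with window functions. (See the PostgreSQL docs on window functions.) Before they existed, you had to do ugly things whenever you wanted to pull other columns of a special row, for example the row with the maximum population, into the final output.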

-- Assumes the city-name column is called city_name and that country_code
-- is the only column name the two tables share.
WITH preliminary AS 
     (SELECT country_code, city_name, population,
      rank() OVER (PARTITION BY country_code ORDER BY population DESC) AS r
      FROM country
      NATURAL JOIN city) -- NATURAL JOIN collapses 2 country_code columns into 1
SELECT * FROM preliminary WHERE r=1;

In the unlikely event that two or more of a country's largest cities have exactly the same population, this also does something sensible.
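To make the tie behaviour concrete: rank() gives every tied city rank 1, so all of them come back, whereas row_number() would keep just one of them, chosen arbitrarily. Here is a minimal sketch with made-up data. The table contents, the helper name top_cities, and the use of Python's sqlite3 in place of PostgreSQL are all assumptions of this sketch, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; the rows are invented so that
# country 'XX' has two cities tied for the largest population.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table city (country_code text, city_name text, population integer);
insert into city values
('XX', 'Alpha', 500), ('XX', 'Beta', 500), ('XX', 'Gamma', 100),
('YY', 'Delta', 300), ('YY', 'Epsilon', 200);
""")

def top_cities(window_function):
    """Return the rows ranked first per country by the given window function."""
    sql = f"""
    with preliminary as
         (select country_code, city_name, population,
          {window_function}() over (partition by country_code
                                    order by population desc) as r
          from city)
    select country_code, city_name, population
    from preliminary where r = 1
    order by country_code, city_name
    """
    return conn.execute(sql).fetchall()

with_rank = top_cities("rank")              # keeps both tied cities
with_row_number = top_cities("row_number")  # keeps one city per country

print(with_rank)
print(len(with_row_number))
```

If you want exactly one row per country even when there is a tie, row_number() with an extra tie-breaking column in the ORDER BY is one option; rank() is the choice when you would rather see every tied city.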

[Edit in response to comment]

Before window functions, my usual approach was

SELECT country_code, city_name, population
FROM country co1 NATURAL JOIN city ci1
WHERE ROW(co1.country_code, ci1.population) =
    (SELECT co2.country_code, ci2.population 
     FROM country co2 NATURAL JOIN city ci2
     WHERE co1.country_code = co2.country_code 
     ORDER BY population DESC LIMIT 1);
-- note for lurkers, some other DBs use TOP 1 instead of LIMIT

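The performance of this isn't too bad, because Postgres will optimize the subquery if the DB is sensibly indexed. Compare this with the inner join on a subquery in Mike Sherrill's answer.

Get back to us with your instructor's answer, will you? With the tools you have been given so far, it is probably inefficient, incomplete when there are ties, or both.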

How to use GROUP BY in PostgreSQL
