Question

Suppose I have the following hypothetical data structure:

create table "country"
(
  country_id integer,  
  country_name varchar(50),
  continent varchar(50),
  constraint country_pkey primary key (country_id)
);

create table "person"
(
  person_id integer,
  person_name varchar(100),
  country_id integer,
  constraint person_pkey primary key (person_id)
);

create table "event"
(
  event_id integer,
  event_desc varchar(100),
  country_id integer,
  constraint event_pkey primary key (event_id)
);

I want to query the number of people and the number of events per country. I decided to use subqueries.

select c.country_name, sum(sub1.person_count) as person_count, sum(sub2.event_count) as event_count
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
group by c.country_name

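I know you can achieve this with select statements in the field list, but the advantage of subqueries is that I have more flexibility to change the SQL so that it summarizes by, and uses, other fields. Say I change the query to display the counts by continent: it is as simple as replacing the field "c.country_name" with "c.continent".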
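For illustration, a sketch of the continent variant. Only the select list and the group by change; the subqueries stay exactly as they were.

```sql
-- Same query as above, summarized by continent instead of by country.
select c.continent,
  sum(sub1.person_count) as person_count,
  sum(sub2.event_count) as event_count
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
group by c.continent;
```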

My question is about filtering. If we add a where clause like this:

select c.country_name, 
  sum(sub1.person_count) as person_count, 
  sum(sub2.event_count) as event_count
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
where c.country_name='UNITED STATES'
group by c.country_name

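The subqueries still seem to perform the count for all countries. Assume the person and event tables are large, and that I already have indexes on country_id on all tables. The query is really slow. Shouldn't the database execute the subqueries only for the country that was filtered? Do I have to repeat the country filter in each subquery (which is very tedious and makes the code hard to modify)? I'm using both PostgreSQL 8.3 and 9.0, by the way, but I guess the same is true for other databases.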

Answer 1

Shouldn't the database execute the subqueries only for the country that was filtered?

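No. Conceptually, the first step in a query like yours is to build a working table from all the table constructors in the FROM clause. The WHERE clause is evaluated after that.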

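Imagine how you would do this if sub1 and sub2 were both base tables instead of subselects. Each would have two columns and one row per country_id. If you wanted to join all the rows, you would write it like this.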

from
  "country" c
  left join sub1 on (c.country_id=sub1.country_id)
  left join sub2 on (c.country_id=sub2.country_id)

But if you wanted to join on just one row, you would write something equivalent to this.

from
  "country" c
  -- a subselect in FROM needs an alias; the names sub1 and sub2 are reused here
  left join (select * from sub1 where country_id = ?) sub1
    on (c.country_id=sub1.country_id)
  left join (select * from sub2 where country_id = ?) sub2
    on (c.country_id=sub2.country_id)
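Applied to the query from the question, the same idea might look like the sketch below. It assumes the filter stays on country_name, so each subquery looks up the matching country_id itself. This is one possible rewrite, not the only one, and it does repeat the filter in each subquery.

```sql
-- Sketch: restrict each subquery to the filtered country before counting.
select c.country_name,
  sum(sub1.person_count) as person_count,
  sum(sub2.event_count) as event_count
from
  "country" c
  left join (
    select country_id, count(*) as person_count
    from "person"
    where country_id in (select country_id from "country"
                         where country_name = 'UNITED STATES')
    group by country_id
  ) sub1
    on (c.country_id=sub1.country_id)
  left join (
    select country_id, count(*) as event_count
    from "event"
    where country_id in (select country_id from "country"
                         where country_name = 'UNITED STATES')
    group by country_id
  ) sub2
    on (c.country_id=sub2.country_id)
where c.country_name = 'UNITED STATES'
group by c.country_name;
```

With the filter inside the subqueries, the indexes on country_id in "person" and "event" become usable for the counts.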

Joe Celko, who helped develop the early SQL standards, has often written on Usenet about how SQL's order of evaluation appears to work.

Answer 2

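Can you filter and group the rows by country_id rather than by country_name? I guess you don't have an index on the name.
The subqueries don't use any index because they scan the whole tables. If you want to reduce the scanning, you should filter the data.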
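A minimal sketch of both suggestions follows. The index name country_name_idx and the value 1 for country_id are placeholders, not taken from the question. Whether the planner carries the country_id condition into the subqueries may depend on the PostgreSQL version, so the plan printed by explain analyze is the thing to check.

```sql
-- Hypothetical index so that a lookup by name does not scan "country".
create index country_name_idx on "country" (country_name);

-- Filter and group by country_id; 1 is a placeholder id.
-- explain analyze runs the query and shows the plan actually used.
explain analyze
select c.country_id, c.country_name,
  sum(sub1.person_count) as person_count,
  sum(sub2.event_count) as event_count
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
where c.country_id = 1
group by c.country_id, c.country_name;
```

If the plan still shows full scans of "person" and "event", the filter has not reached the subqueries and would need to be written inside them.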
