Avoiding useless subqueries or aggregations when joining and grouping

Time: 2014-02-06 16:56:19

Tags: sql postgresql postgresql-9.2

I have two tables in a chat database, room and message:

CREATE TABLE room (
    id serial primary key,
    name varchar(50) UNIQUE NOT NULL,
    private boolean NOT NULL default false,
    description text NOT NULL
);

CREATE TABLE message (
    id bigserial primary key,
    room integer references room(id),
    author integer references player(id),
    created integer NOT NULL
);

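Suppose I want the rooms in which a given user has posted, along with the number of messages from that user and the date of the most recent one: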

 id | number | last_created | description |      name        | private 
----+--------+--------------+-------------+------------------+---------
  2 |   1149 |   1391703964 |             | Dragons & co     | t
  8 |    136 |   1391699600 |             | Javascript       | f
 10 |     71 |   1391684998 |             | WBT              | t
  1 |     86 |   1391682712 |             | Miaou            | f
  3 |    423 |   1391681764 |             | Code & Baguettes | f
  ...

I see two solutions:

1) Select from and group on message, and use subqueries to fetch the room columns:

select m.room as id, count(*) number, max(created) last_created,
(select name from room where room.id=m.room),
(select description from room where room.id=m.room),
(select private from room where room.id=m.room)
from message m where author=$1 group by room order by last_created desc limit 10

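This makes 3 almost identical subqueries, which looks dirty. I could invert it so that only 2 subqueries are needed, on the message columns, but that wouldn't be much better.

2) Select from both tables and use aggregate functions for all the columns: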

select room.id, count(*) number, max(created) last_created,
max(name) as name, max(description) as description, bool_or(private) as private
from message, room
where message.room=room.id and author=$1
group by room.id order by last_created desc limit 10

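All those aggregate functions look messy and useless.

Is there a clean solution here?

This looks like a general problem to me. In theory those aggregate functions are useless because, by construction, every joined row within a group carries the same room row. I'd like to know whether there is a general solution.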

3 answers:

Answer 0: (score: 3)

Maybe use a join?

SELECT 
  r.id, count(*) number_of_posts, 
  max(m.created) last_created,
  r.name, r.description, r.private
FROM room r
JOIN message m on r.id = m.room
WHERE m.author = $1
GROUP BY r.id 
ORDER BY last_created desc

Answer 1: (score: 3)

Try doing the grouping in a subquery:

select m.id, m.number, m.last_created, r.name, r.description, r.private
from (
    select m.room as id, count(*) number, max(created) last_created
    from message m 
    where author=$1 
    group by room 
) m
 join room r
   on r.id = m.id
order by m.last_created desc limit 10
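
As a rough way to try the aggregate-first, join-second idea without a Postgres server, here is a minimal sketch using Python's built-in sqlite3 module. It is not the answerer's code: the sample rows are made up, `rooms_for_author` is a hypothetical helper name, the `$1` placeholder becomes `?`, and the reference to the `player` table is dropped because that table is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified schema (assumption: SQLite types, no player table).
conn.executescript("""
    CREATE TABLE room (
        id integer primary key,
        name varchar(50) UNIQUE NOT NULL,
        private boolean NOT NULL default 0,
        description text NOT NULL
    );
    CREATE TABLE message (
        id integer primary key,
        room integer references room(id),
        author integer,
        created integer NOT NULL
    );
    -- Made-up sample data, for illustration only.
    INSERT INTO room (id, name, private, description) VALUES
        (1, 'Miaou', 0, ''),
        (2, 'Dragons & co', 1, ''),
        (3, 'Code & Baguettes', 0, '');
    INSERT INTO message (room, author, created) VALUES
        (1, 7, 100), (1, 7, 300), (1, 9, 400),
        (2, 7, 500), (2, 7, 200), (2, 7, 250),
        (3, 9, 600);
""")


def rooms_for_author(author_id):
    """Aggregate message per room first, then join room exactly once."""
    return conn.execute("""
        select m.id, m.number, m.last_created, r.name, r.description, r.private
        from (
            select room as id, count(*) as number, max(created) as last_created
            from message
            where author = ?
            group by room
        ) m
        join room r on r.id = m.id
        order by m.last_created desc limit 10
    """, (author_id,)).fetchall()


rows = rooms_for_author(7)
for row in rows:
    print(row)
```

Because the grouping happens before the join, no room column needs an aggregate or a place in the group by. Query plans and performance on Postgres may of course differ from what SQLite does here.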

Edit: another option (probably with similar performance) is to move that aggregation into a view, e.g.:

create view MessagesByRoom
as 
select m.author, m.room, count(*) number, max(created) last_created
from message m 
group by author, room

Then use it:

select m.room, m.number, m.last_created, r.name, r.description, r.private
from MessagesByRoom m
 join room r
   on r.id = m.room
where m.author = $1
order by m.last_created desc limit 10

Answer 2: (score: 1)

You can add the columns to the group by:

select room.id, count(*) number, max(message.created) last_created,
       room.name, room.description, room.private
from message join
     room
     on message.room=room.id and author=$1
group by room.id, name, description, private
order by last_created desc
limit 10;
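
To check that listing the room columns in the group by returns the same rows as the question's aggregate-everything version, here is a small sketch with Python's built-in sqlite3 module. The data is invented, the constant names `GROUP_BY_ALL` and `AGGREGATE_ALL` are only for illustration, and since SQLite has no `bool_or`, `max(private)` stands in for it (an assumption that holds only because the column stores 0 or 1 here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified schema and made-up rows (assumption: SQLite, no player table).
conn.executescript("""
    CREATE TABLE room (
        id integer primary key,
        name varchar(50) UNIQUE NOT NULL,
        private boolean NOT NULL default 0,
        description text NOT NULL
    );
    CREATE TABLE message (
        id integer primary key,
        room integer references room(id),
        author integer,
        created integer NOT NULL
    );
    INSERT INTO room (id, name, private, description) VALUES
        (1, 'Miaou', 0, ''), (8, 'Javascript', 0, ''), (10, 'WBT', 1, '');
    INSERT INTO message (room, author, created) VALUES
        (8, 5, 10), (8, 5, 40), (10, 5, 30), (1, 6, 50);
""")

# Every selected room column is listed in the group by.
GROUP_BY_ALL = """
    select room.id, count(*) as number, max(message.created) as last_created,
           room.name, room.description, room.private
    from message join room on message.room = room.id and author = ?
    group by room.id, name, description, private
    order by last_created desc
    limit 10
"""

# The question's second approach; max(private) replaces bool_or(private).
AGGREGATE_ALL = """
    select room.id, count(*) as number, max(created) as last_created,
           max(name) as name, max(description) as description,
           max(private) as private
    from message, room
    where message.room = room.id and author = ?
    group by room.id order by last_created desc limit 10
"""

by_columns = conn.execute(GROUP_BY_ALL, (5,)).fetchall()
by_aggregates = conn.execute(AGGREGATE_ALL, (5,)).fetchall()
print(by_columns)
print(by_columns == by_aggregates)
```

Both forms group on the same room rows, so they agree; the extra group by columns add nothing beyond satisfying the grouping rule.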

Edit:

This query works in more recent versions of Postgres:

select room.id, count(*) number, max(message.created) last_created,
       room.name, room.description, room.private
from message join
     room
     on message.room=room.id and author=$1
group by room.id
order by last_created desc
limit 10;

The documentation for earlier versions is quite clear that you need to include all the columns:

  

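When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.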

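The ANSI standard actually permits the query above with only room.id in the group by. Support for this is a fairly recent addition in the databases that have it.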