How to join only one row from the joined table with Postgres?

Time: 2014-06-04 16:05:42

Tags: sql postgresql join

I have the following schema:

CREATE TABLE author (
    id   integer
  , name varchar(255)
);
CREATE TABLE book (
    id        integer
  , author_id integer
  , title     varchar(255)
  , rating    integer
);

I want each author together with their last book:

SELECT book.id, author.id, author.name, book.title as last_book
FROM author
JOIN book book ON book.author_id = author.id

GROUP BY author.id
ORDER BY book.id ASC

Apparently you can do this in MySQL: Join two tables in MySQL, returning just one row from the second table

But Postgres gives this error:

  

ERROR: column "book.id" must appear in the GROUP BY clause or be used in an aggregate function: SELECT book.id, author.id, author.name, book.title as last_book FROM author JOIN book book ON book.author_id = author.id GROUP BY author.id ORDER BY book.id ASC

That is because:

  

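When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.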

How can I tell Postgres: "Give me only the last row of the joined table, when ordered by joined_table.id"?


Edit: With this data:

INSERT INTO author (id, name) VALUES
  (1, 'Bob')
, (2, 'David')
, (3, 'John');

INSERT INTO book (id, author_id, title, rating) VALUES
  (1, 1, '1st book from bob', 5)
, (2, 1, '2nd book from bob', 6)
, (3, 1, '3rd book from bob', 7)
, (4, 2, '1st book from David', 6)
, (5, 2, '2nd book from David', 6);

I should see:

book_id author_id name    last_book
3       1         "Bob"   "3rd book from bob"
5       2         "David" "2nd book from David"

7 answers:

Answer 0: (score: 37)

select distinct on (author.id)
    book.id, author.id, author.name, book.title as last_book
from
    author
    inner join
    book on book.author_id = author.id
order by author.id, book.id desc

Check distinct on:

  

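SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.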

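With distinct on, the "distinct" columns have to come first in the order by. If that is not the order you want, you need to wrap the query and reorder the result: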

select 
    *
from (
    select distinct on (author.id)
        book.id, author.id, author.name, book.title as last_book
    from
        author
        inner join
        book on book.author_id = author.id
    order by author.id, book.id desc
) authors_with_last_book
order by authors_with_last_book.name

Another solution is to use a window function, as in Lennart's answer. Yet another, very generic one is this:

select 
    book.id, author.id, author.name, book.title as last_book
from
    book
    inner join
    (
        select author.id as author_id, max(book.id) as book_id
        from
            author
            inner join
            book on author.id = book.author_id
        group by author.id
    ) s
    on s.book_id = book.id
    inner join
    author on book.author_id = author.id

Answer 1: (score: 5)

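This may look archaic and overly simple, but it does not depend on window functions, CTEs, or aggregating subqueries. In most cases it is also the fastest.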

SELECT bk.id, au.id, au.name, bk.title as last_book
FROM author au
JOIN book bk ON bk.author_id = au.id
-- keep a book only if the same author has no book with a higher id
WHERE NOT EXISTS (
    SELECT *
    FROM book nx
    WHERE nx.author_id = bk.author_id
    AND nx.id > bk.id
    )
ORDER BY bk.id ASC
    ;

Answer 2: (score: 4)

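I did something similar for a chat system, where room holds the metadata and list contains the messages. I ended up using a PostgreSQL LATERAL JOIN, which worked like a charm.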

SELECT MR.id AS room_id, MR.created_at AS room_created, 
    lastmess.content as lastmessage_content, lastmess.datetime as lastmessage_when
FROM message.room MR
    LEFT JOIN LATERAL (
        SELECT content, datetime
        FROM message.list
        WHERE room_id = MR.id
        ORDER BY datetime DESC 
        LIMIT 1) lastmess ON true
ORDER BY lastmessage_when DESC NULLS LAST, MR.created_at DESC

For more information, see https://heap.io/blog/engineering/postgresqls-powerful-new-join-type-lateral
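Applied to the author/book schema from the question, the same LATERAL idea might look like the sketch below. It is not part of the original answer; the aliases a, b, and lb are illustrative, and it assumes PostgreSQL 9.3 or later, where LATERAL is available.

```sql
-- For each author, run the subquery once and keep only
-- that author's book with the highest id.
SELECT lb.id AS book_id, a.id AS author_id, a.name, lb.title AS last_book
FROM author a
CROSS JOIN LATERAL (
    SELECT b.id, b.title
    FROM book b
    WHERE b.author_id = a.id   -- refers to the current row of author
    ORDER BY b.id DESC
    LIMIT 1
) lb
ORDER BY a.id;
```

CROSS JOIN LATERAL drops authors that have no books, which matches the expected output in the question. Using LEFT JOIN LATERAL ... ON true instead, as in the chat example, would keep such authors with NULL book columns.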

Answer 3: (score: 3)

Here is one way:

SELECT book_id, author_id, author_name, last_book
FROM (
    SELECT b.id as book_id
         , a.id as author_id
         , a.name as author_name
         , b.title as last_book
         , row_number() over (partition by a.id
                              order by b.id desc) as rn
    FROM author a
    JOIN book b 
        ON b.author_id = a.id
) last_books
WHERE rn = 1;

Answer 4: (score: 2)

You can add a condition to the join so that it matches only one row. This worked for me.

Like this:

SELECT 
    book.id, 
    auth1.id, 
    auth1.name, 
    book.title as last_book
FROM author auth1
-- the subquery picks the highest book id for the current author
JOIN book book ON (book.author_id = auth1.id AND book.id = (select max(b.id) from book b where b.author_id = auth1.id))
-- no GROUP BY needed: the join already yields one row per author
ORDER BY book.id ASC

This way you get the data from the book with the highest id. You could also add a "date" column and do the same with max(date).
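A sketch of that date-based variant is shown below. The book table in the question has no date column, so published_on is a hypothetical column added purely for illustration, and the aliases are likewise illustrative.

```sql
-- Hypothetical column, not part of the original schema.
ALTER TABLE book ADD COLUMN published_on date;

-- Same pattern as the join condition above, but "last" now means
-- the most recent published_on instead of the highest id.
SELECT bk.id AS book_id, au.id AS author_id, au.name, bk.title AS last_book
FROM author au
JOIN book bk
  ON bk.author_id = au.id
 AND bk.published_on = (
        SELECT max(b.published_on)
        FROM book b
        WHERE b.author_id = au.id
     )
ORDER BY bk.id;
```

Unlike ids, dates are not necessarily unique: if an author has two books with the same latest published_on, both rows are returned, so a tie-breaker may be needed.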

Answer 5: (score: 0)

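As a slight variation on @wildplasser's suggestion, which still works across implementations, you can use max rather than not exists. This reads better if you prefer short joins to long where clauses:
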
select * 
  from author au
  join (
    select max(id) as max_id, author_id
      from book bk
     group by author_id) as lb 
    on lb.author_id = au.id
  join book bk 
    on bk.id = lb.max_id;

Or, to give the subquery a name, which clarifies things, use WITH:

with last_book as 
   (select max(id) as max_id, author_id
      from book bk
     group by author_id)

select * 
  from author au
  join last_book lb
    on au.id = lb.author_id
  join book bk 
    on bk.id = lb.max_id;

Answer 6: (score: 0)

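-- Rank each author's books, highest id first, so that rownum = 1 is the last book.
-- The original ended with "distributed by ( id )", which is Greenplum syntax
-- and is rejected by stock PostgreSQL, so it is omitted here.
create temp table book_1 as (
SELECT
id
,title
,author_id
,row_number() OVER (PARTITION BY author_id ORDER BY id DESC) as rownum 
FROM
book);

-- The left join keeps authors without any book (book columns are NULL).
select b.id, author.id, author.name, b.title as last_book
from
    author

    left  join
   (select * from  book_1 where rownum = 1 ) b on b.author_id = author.id
order by author.id, b.id desc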