Question

I have the following schema:

CREATE TABLE author (
    id   integer
  , name varchar(255)
);
CREATE TABLE book (
    id        integer
  , author_id integer
  , title     varchar(255)
  , rating    integer
);

I want to get each author together with their last book:

SELECT book.id, author.id, author.name, book.title as last_book
FROM author
JOIN book book ON book.author_id = author.id

GROUP BY author.id
ORDER BY book.id ASC

Apparently you can do this in MySQL: Join two tables in MySQL, returning just one row from the second table.

But Postgres gives this error:

ERROR: column "book.id" must appear in the GROUP BY clause or be used in an aggregate function: SELECT book.id, author.id, author.name, book.title as last_book FROM author JOIN book book ON book.author_id = author.id GROUP BY author.id ORDER BY book.id ASC

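This is because:

When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.

How can I tell Postgres: "Give me only the last row of the joined table, when ordered by joined_table.id"?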

Edit: With this data:

INSERT INTO author (id, name) VALUES
  (1, 'Bob')
, (2, 'David')
, (3, 'John');

INSERT INTO book (id, author_id, title, rating) VALUES
  (1, 1, '1st book from bob', 5)
, (2, 1, '2nd book from bob', 6)
, (3, 1, '3rd book from bob', 7)
, (4, 2, '1st book from David', 6)
, (5, 2, '2nd book from David', 6);

I should see:

book_id author_id name    last_book
3       1         "Bob"   "3rd book from bob"
5       2         "David" "2nd book from David"

Answer 1

select distinct on (author.id)
    book.id, author.id, author.name, book.title as last_book
from
    author
    inner join
    book on book.author_id = author.id
order by author.id, book.id desc

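Check distinct on:

SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

With distinct on, the "distinct" columns must come first in the order by. If that is not the order you want, you need to wrap the query and reorder: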

select 
    *
from (
    select distinct on (author.id)
        book.id, author.id, author.name, book.title as last_book
    from
        author
        inner join
        book on book.author_id = author.id
    order by author.id, book.id desc
) authors_with_first_book
order by authors_with_first_book.name

Another solution is to use a window function, as in Lennart's answer. Another, very generic one is this:

select 
    book.id, author.id, author.name, book.title as last_book
from
    book
    inner join
    (
        select author.id as author_id, max(book.id) as book_id
        from
            author
            inner join
            book on author.id = book.author_id
        group by author.id
    ) s
    on s.book_id = book.id
    inner join
    author on book.author_id = author.id

Answer 2

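This may look archaic and overly simple, but it does not depend on window functions, CTEs, or aggregating subqueries. In most cases it is also the fastest.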

SELECT bk.id, au.id, au.name, bk.title as last_book
FROM author au
JOIN book bk ON bk.author_id = au.id
-- keep a book only if the same author has no book with a higher id
WHERE NOT EXISTS (
    SELECT *
    FROM book nx
    WHERE nx.author_id = bk.author_id
    AND nx.id > bk.id
    )
ORDER BY bk.id ASC
    ;
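
Whether this is really the fastest depends on the data and on the available indexes. As a sketch, assuming no suitable index exists yet (the index name below is made up for illustration), a composite index lets the NOT EXISTS probe look up "a later book by the same author" directly:

```sql
-- Hypothetical index name; the columns match the correlated subquery's
-- predicates (equality on author_id, range comparison on id).
CREATE INDEX book_author_id_id_idx ON book (author_id, id);
```

Comparing EXPLAIN ANALYZE output with and without the index on your own data is the only reliable way to tell whether it helps.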

Answer 3

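I did something similar for a chat system, where the room table holds the metadata and the list table holds the messages. I ended up using the PostgreSQL LATERAL JOIN, which worked like a charm.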

SELECT MR.id AS room_id, MR.created_at AS room_created, 
    lastmess.content as lastmessage_content, lastmess.datetime as lastmessage_when
FROM message.room MR
    LEFT JOIN LATERAL (
        SELECT content, datetime
        FROM message.list
        WHERE room_id = MR.id
        ORDER BY datetime DESC 
        LIMIT 1) lastmess ON true
ORDER BY lastmessage_when DESC NULLS LAST, MR.created_at DESC

For more information, see https://heap.io/blog/engineering/postgresqls-powerful-new-join-type-lateral
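
Applied to the author/book schema from the question, the same idea might look like the sketch below. It is my adaptation, not something from the chat system above; the alias lb is arbitrary. CROSS JOIN LATERAL is used instead of LEFT JOIN LATERAL so that authors without any book (John in the sample data) are left out, which matches the output the question asks for.

```sql
SELECT lb.id AS book_id, a.id AS author_id, a.name, lb.title AS last_book
FROM author a
CROSS JOIN LATERAL (
    -- runs once per author row and picks that author's highest book id
    SELECT b.id, b.title
    FROM book b
    WHERE b.author_id = a.id
    ORDER BY b.id DESC
    LIMIT 1
) lb
ORDER BY lb.id;
```

Switching to LEFT JOIN LATERAL ... ON true would keep authors without books, with NULL in the book columns.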

Answer 4

Here is one way:

SELECT book_id, author_id, author_name, last_book
FROM (
    SELECT b.id as book_id
         , a.id as author_id
         , a.name as author_name
         , b.title as last_book
         , row_number() over (partition by a.id
                              order by b.id desc) as rn
    FROM author a
    JOIN book b 
        ON b.author_id = a.id
) last_books
WHERE rn = 1;

Answer 5

You can add a condition to the join so that it matches only one row. It worked for me.

Like this:

SELECT 
    book.id, 
    auth1.id, 
    auth1.name, 
    book.title as last_book
FROM author auth1
JOIN book book ON (book.author_id = auth1.id AND book.id = (select max(b.id) from book b where b.author_id = auth1.id))
ORDER BY book.id ASC

This way you get the data from the book with the highest id. You could also add a "date" column and do the same with max(date).
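
A sketch of the date variant, assuming a hypothetical column book.published_on of type date (it is not part of the schema in the question):

```sql
-- published_on is an assumed column, added here only for illustration.
SELECT b.id AS book_id, a.id AS author_id, a.name, b.title AS last_book
FROM author a
JOIN book b
  ON b.author_id = a.id
 AND b.published_on = (SELECT max(b2.published_on)
                       FROM book b2
                       WHERE b2.author_id = a.id)
ORDER BY b.id;
```

Unlike id, a date is not necessarily unique per author, so two books sharing the latest date would both be returned.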

Answer 6

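As a slight variation on @wildplasser's suggestion, which still works across implementations, you can use max rather than not exists. This reads better if you prefer short joins to long where clauses.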

select * 
  from author au
  join (
    select max(id) as max_id, author_id
      from book
     group by author_id) as lb 
    on lb.author_id = au.id
  join book bk 
    on bk.id = lb.max_id;

Or, to give the subquery a name, which clarifies things, use WITH:

with last_book as 
   (select max(id) as max_id, author_id
      from book
     group by author_id)

select * 
  from author au
  join last_book lb
    on au.id = lb.author_id
  join book bk 
    on bk.id = lb.max_id;

Answer 7

-- number each author's books, highest id first
create temp table book_1 as (
SELECT
id
,title
,author_id
,row_number() OVER (PARTITION BY author_id ORDER BY id DESC) as rownum 
FROM
book);

-- the left join keeps authors without books (their book columns are NULL)
select b.id, author.id, author.name, b.title as last_book
from
    author

    left  join
   (select * from  book_1 where rownum = 1 ) b on b.author_id = author.id
order by author.id

How to join only one row in joined table with postgres?
