I have a Messages table that looks like this:
Messages
+----+-----------+-------------+------------+
| id | sender_id | receiver_id | created_at |
+----+-----------+-------------+------------+
|  1 |         1 |           2 | 1/1/2013   |
|  2 |         1 |           2 | 1/1/2013   |
|  3 |         2 |           1 | 1/2/2013   |
|  4 |         3 |           2 | 1/2/2013   |
|  5 |         3 |           2 | 1/3/2013   |
|  6 |         5 |           4 | 1/4/2013   |
+----+-----------+-------------+------------+
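If a 'thread' is the set of messages between a given sender_id and receiver_id, I want a query that returns the 10 most recent messages from each of the 10 most recent threads in which either sender_id or receiver_id is a given id.
Expected output for a given user_id of 5: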
+----+-----------+-------------+------------+
| id | sender_id | receiver_id | created_at |
+----+-----------+-------------+------------+
|  1 |         5 |           2 | 1/4/2013   |
|  2 |         5 |           2 | 1/4/2013   |
|  3 |         2 |           5 | 1/4/2013   |
|  4 |         3 |           5 | 1/4/2013   |
|  5 |         5 |           2 | 1/3/2013   |
|  6 |         5 |           4 | 1/3/2013   |
+----+-----------+-------------+------------+
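That is, there should be a limit of at most 10 messages per thread, for example between users 5 and 2 (there are 4 above), and a limit of 10 threads (there are 3 above).
I have been trying to write this kind of query with a subquery, but I have not managed to get the second limit, the one on the number of distinct threads.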
SELECT * FROM (
    SELECT DISTINCT ON (sender_id, receiver_id) messages.*
    FROM messages
    WHERE (receiver_id = 5 OR sender_id = 5)
    ORDER BY sender_id, receiver_id, created_at DESC
) q
ORDER BY created_at DESC
LIMIT 10 OFFSET 0;
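I am considering creating a new Thread table with a thread_id field that would be the concatenation of sender_id + receiver_id, and then just joining it with Messages, but I have a sneaking suspicion that this can be done with a single table.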
Answer 0: (score: 2)
The tidiest way I can think of to solve your problem in a single query is the following:
select * from (
    select row_number()
               over (partition by sender_id, receiver_id order by created_at desc) as rn,
           m.*
    from Messages m
    where (m.sender_id, m.receiver_id) in (
        select sender_id, receiver_id
        from Messages
        where sender_id = <id> or receiver_id = <id>
        group by sender_id, receiver_id
        order by max(created_at) desc
        limit 10 offset 0
    )
) res where res.rn <= 10
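The row_number() over (partition by sender_id, receiver_id order by created_at desc) column holds the row number of each message within its thread (it is like the record number you would get if you ran a separate query for just one thread). Alongside this row number, you select the message itself if it belongs to one of the 10 topmost threads (this is what the (m.sender_id, m.receiver_id) in ...query... condition does). Finally, because you only need the 10 topmost messages of each thread, you restrict the row number to be less than or equal to 10.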
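To make the two limits easier to see, here is a minimal sketch that runs the same shape of query against the sample table from the question. It rests on several assumptions that are not in the original: an in-memory SQLite database (via Python's sqlite3) stands in for PostgreSQL, dates are stored as ISO strings so they sort as text, id is added as a tie-breaker, and the names make_db, top_messages, max_threads and per_thread are made up for the illustration. Like the query above, it treats (sender_id, receiver_id) as an ordered pair.

```python
import sqlite3

ROWS = [
    (1, 1, 2, "2013-01-01"), (2, 1, 2, "2013-01-01"), (3, 2, 1, "2013-01-02"),
    (4, 3, 2, "2013-01-02"), (5, 3, 2, "2013-01-03"), (6, 5, 4, "2013-01-04"),
]

QUERY = """
SELECT id, sender_id, receiver_id, created_at
FROM (
    SELECT row_number() OVER (
               PARTITION BY sender_id, receiver_id
               ORDER BY created_at DESC, id DESC
           ) AS rn,
           m.*
    FROM messages m
    WHERE (m.sender_id, m.receiver_id) IN (
        -- the most recent threads that involve the user
        SELECT sender_id, receiver_id
        FROM messages
        WHERE sender_id = :uid OR receiver_id = :uid
        GROUP BY sender_id, receiver_id
        ORDER BY max(created_at) DESC
        LIMIT :max_threads
    )
) res
WHERE res.rn <= :per_thread          -- newest messages of each thread
ORDER BY created_at DESC, id DESC
"""


def make_db():
    """Build an in-memory copy of the question's sample table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender_id INTEGER, "
        "receiver_id INTEGER, created_at TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", ROWS)
    return conn


def top_messages(conn, user_id, max_threads=10, per_thread=10):
    """Return the newest messages of the user's most recent threads."""
    params = {"uid": user_id, "max_threads": max_threads, "per_thread": per_thread}
    return conn.execute(QUERY, params).fetchall()


if __name__ == "__main__":
    for row in top_messages(make_db(), user_id=2, max_threads=2, per_thread=10):
        print(row)
```

With max_threads=2 the oldest thread of user 2 is dropped, and lowering per_thread trims each remaining thread separately, which is the pair of limits the question asks for.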
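Answer 1: (score: 2)
I suggest taking couling's answer and modifying it slightly so that it uses common table expressions, which effectively gives you two queries: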
WITH threads (sender_id, receiver_id, latest) as (
    select sender,
           receiver,
           max(sent)
    from sof_messages
    where receiver = <user>
       or sender = <user>
    group by sender,
             receiver
    order by 3 desc  -- most recent threads first
    limit 10
),
messages ([messages fields listed here], rank) as (
    select m.*,
           rank() over (partition by sender, receiver order by sent desc)
    from sof_messages m
    where (sender, receiver) in (select sender_id, receiver_id from threads))
SELECT * from messages where rank <= 10;
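The advantage of this is that it gives the planner a good idea of when to use indexes. In essence, each of the three parts of the query is planned independently.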
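For anyone who wants to try the CTE shape quickly, here is a small sketch under stated assumptions: SQLite (through Python's sqlite3) stands in for PostgreSQL, the table and column names follow the question (messages, sender_id, receiver_id, created_at) rather than sof_messages, and make_db, ranked_messages, rnk and the parameter names are invented for the example. It also shows one thing to keep in mind about rank(): messages with the same timestamp share a rank, so a thread can return more rows than the limit.

```python
import sqlite3

ROWS = [
    (1, 1, 2, "2013-01-01"), (2, 1, 2, "2013-01-01"), (3, 2, 1, "2013-01-02"),
    (4, 3, 2, "2013-01-02"), (5, 3, 2, "2013-01-03"), (6, 5, 4, "2013-01-04"),
]

CTE_QUERY = """
WITH threads (sender_id, receiver_id, latest) AS (
    SELECT sender_id, receiver_id, max(created_at)
    FROM messages
    WHERE receiver_id = :uid OR sender_id = :uid
    GROUP BY sender_id, receiver_id
    ORDER BY 3 DESC                      -- newest threads first
    LIMIT :max_threads
),
ranked AS (
    SELECT m.*,
           rank() OVER (PARTITION BY m.sender_id, m.receiver_id
                        ORDER BY m.created_at DESC) AS rnk
    FROM messages m
    WHERE (m.sender_id, m.receiver_id) IN
          (SELECT sender_id, receiver_id FROM threads)
)
SELECT id, sender_id, receiver_id, created_at
FROM ranked
WHERE rnk <= :per_thread
ORDER BY created_at DESC, id DESC
"""


def make_db():
    """In-memory copy of the question's sample data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender_id INTEGER, "
        "receiver_id INTEGER, created_at TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", ROWS)
    return conn


def ranked_messages(conn, user_id, max_threads=10, per_thread=10):
    """Run the two-CTE query and return the matching message rows."""
    params = {"uid": user_id, "max_threads": max_threads, "per_thread": per_thread}
    return conn.execute(CTE_QUERY, params).fetchall()


if __name__ == "__main__":
    # Messages 1 and 2 share a timestamp, so both get rank 1 in their thread.
    for row in ranked_messages(make_db(), user_id=1, per_thread=1):
        print(row)
```

If exactly N rows per thread are required, row_number() with a tie-breaking column in the ORDER BY is probably the safer choice.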
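Answer 2: (score: 1)
I am posting this to show what can be done.
I really don't recommend using it.
It would be much better to run two separate queries: one to retrieve the 10 most recent threads, and one, run repeatedly, to read the 10 latest messages of each thread.
However, you can achieve your goal with the rank() window function, as shown below.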
select * from (
    select message.*,
           rank() over (partition by message.sender, message.receiver
                        order by sent desc)
    from sof_messages message,
         (
             select sender,
                    receiver,
                    max(sent)
             from sof_messages
             where receiver = <user>
                or sender = <user>
             group by sender,
                      receiver
             order by 3 desc  -- most recent threads first
             limit 10
         ) thread
    where message.sender = thread.sender
      and message.receiver = thread.receiver
) message_list
where rank <= 10
There are several different queries that will achieve your goal with window functions, and none of them is particularly clean.
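The two-query alternative recommended at the top of this answer could look roughly like the sketch below. It is only an illustration under assumptions of my own: Python's sqlite3 with an in-memory database stands in for the real application and PostgreSQL, the schema follows the question (messages, sender_id, receiver_id, created_at), and the function names recent_threads, latest_messages and inbox are hypothetical.

```python
import sqlite3

ROWS = [
    (1, 1, 2, "2013-01-01"), (2, 1, 2, "2013-01-01"), (3, 2, 1, "2013-01-02"),
    (4, 3, 2, "2013-01-02"), (5, 3, 2, "2013-01-03"), (6, 5, 4, "2013-01-04"),
]


def make_db():
    """In-memory copy of the question's sample data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender_id INTEGER, "
        "receiver_id INTEGER, created_at TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", ROWS)
    return conn


def recent_threads(conn, user_id, max_threads=10):
    """Query 1: the most recent (sender_id, receiver_id) pairs for a user."""
    return conn.execute(
        """SELECT sender_id, receiver_id
           FROM messages
           WHERE sender_id = ? OR receiver_id = ?
           GROUP BY sender_id, receiver_id
           ORDER BY max(created_at) DESC
           LIMIT ?""",
        (user_id, user_id, max_threads),
    ).fetchall()


def latest_messages(conn, sender_id, receiver_id, per_thread=10):
    """Query 2: the newest messages of a single thread."""
    return conn.execute(
        """SELECT id, sender_id, receiver_id, created_at
           FROM messages
           WHERE sender_id = ? AND receiver_id = ?
           ORDER BY created_at DESC, id DESC
           LIMIT ?""",
        (sender_id, receiver_id, per_thread),
    ).fetchall()


def inbox(conn, user_id, max_threads=10, per_thread=10):
    """Run query 1 once, then query 2 once per thread it returned."""
    result = {}
    for sender_id, receiver_id in recent_threads(conn, user_id, max_threads):
        result[(sender_id, receiver_id)] = latest_messages(
            conn, sender_id, receiver_id, per_thread
        )
    return result


if __name__ == "__main__":
    for thread, msgs in inbox(make_db(), user_id=2, max_threads=2).items():
        print(thread, [m[0] for m in msgs])
```

This costs one round trip per thread (at most 11 queries here), but each query is simple and its limit is obvious.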
Answer 3: (score: 1)
Creating a Thread table doesn't look right because it duplicates data, but a view might help:
CREATE VIEW threads AS
SELECT sender_id, receiver_id, min(created_at) AS t_date
FROM messages
GROUP BY sender_id,receiver_id;
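If the date of a thread should be the date of its latest message rather than its earliest one, change min(created_at) to max(created_at).
The view can then be joined back to the messages with: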
SELECT ... FROM messages JOIN threads USING (sender_id,receiver_id)
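As a rough, runnable illustration of the view approach, the sketch below creates the view and joins it back to the messages. The assumptions are mine: SQLite through Python's sqlite3 replaces PostgreSQL, the view uses the max(created_at) variant so t_date is the date of the latest message, the data is the sample table from the question with ISO date strings, and make_db and messages_with_thread_date are hypothetical names.

```python
import sqlite3

ROWS = [
    (1, 1, 2, "2013-01-01"), (2, 1, 2, "2013-01-01"), (3, 2, 1, "2013-01-02"),
    (4, 3, 2, "2013-01-02"), (5, 3, 2, "2013-01-03"), (6, 5, 4, "2013-01-04"),
]


def make_db():
    """Create the sample table plus the threads view (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender_id INTEGER, "
        "receiver_id INTEGER, created_at TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", ROWS)
    conn.execute(
        """CREATE VIEW threads AS
           SELECT sender_id, receiver_id, max(created_at) AS t_date
           FROM messages
           GROUP BY sender_id, receiver_id"""
    )
    return conn


def messages_with_thread_date(conn):
    """Join every message to its thread and sort by thread recency."""
    return conn.execute(
        """SELECT id, sender_id, receiver_id, created_at, t_date
           FROM messages JOIN threads USING (sender_id, receiver_id)
           ORDER BY t_date DESC, created_at DESC, id DESC"""
    ).fetchall()


if __name__ == "__main__":
    for row in messages_with_thread_date(make_db()):
        print(row)
```

The view stores nothing itself, so there is no duplicated data to keep in sync, but the per-thread and per-user limits still have to be applied on top of the join.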
Answer 4: (score: 0)
I haven't tested this, but it looks like you forgot the LIMIT 10 in the subquery, which is what gives you the 10 most recent threads:
SELECT
    *
FROM
    (SELECT DISTINCT ON
        (sender_id, receiver_id) messages.*
    FROM
        messages
    WHERE
        (receiver_id = 5 OR sender_id = 5)
    ORDER BY
        sender_id, receiver_id, created_at DESC
    LIMIT
        10)
    q
ORDER BY
    created_at DESC
LIMIT
    10
OFFSET
    0;
(I have pretty-printed the SQL so that it is easier to tell what is going on.)