This follows on from another question,
Left join and only last row from right
I tried to write a second join using the same strategy. I'm running Postgres 9.1.4 on my dev MacBook Pro. Here is a simplified example:
SELECT * FROM (
SELECT
post.*,
comment.*,
edit.*,
ROW_NUMBER() OVER (PARTITION BY post.id ORDER BY edit.date_applied DESC) AS rna,
ROW_NUMBER() OVER (PARTITION BY post.id ORDER BY comment.date_posted DESC) AS rnb
FROM
post
LEFT JOIN edit
ON post.id = edit.post_id
LEFT JOIN comment
ON post.id = comment.post_id
ORDER BY
post.id DESC
) AS q
WHERE rna = 1 AND rnb = 1;
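So what I want is to pull every post together with its latest edit and its latest comment. My database has about 6000 posts, with roughly 100 comments per post and perhaps 10 edits per post.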
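Now, if I run the query with either one of the joins instead of both, it finishes reasonably quickly (under a minute, which is still not as fast as I'd like). But if I run the query as written above, Postgres chews through the remaining 14 GB on my SSD and gives up after about 5 minutes.
Can anyone explain why this happens? I expect it comes down to my lack of understanding of the PARTITION BY clause. Removing the joined tables from the SELECT clause and adding LIMIT to both the subquery and the outer query changed nothing.
Thanks for reading.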
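Answer 0 (score: 1)
The problem is probably that you are getting a Cartesian product within each post id. Both joins match on post.id only, so every edit row for a post is paired with every comment row for that post. For example, a post with 100 edits and 100 comments produces 10,000 rows after the joins, and the window functions then have to sort all of them.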
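The row blow-up is easy to reproduce on a toy dataset. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, reuses the question's table names, and the row counts (3 posts, 10 edits and 100 comments each) and the integer "dates" are made up. With the question's figures of about 6000 posts and roughly 100 and 10 related rows per post, the same arithmetic gives around 6 million intermediate rows, each carrying every column of all three tables.

```python
import sqlite3

# Toy schema mirroring the question's table names. SQLite stands in for
# PostgreSQL, and all row counts here are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post    (id INTEGER PRIMARY KEY);
    CREATE TABLE edit    (post_id INTEGER, date_applied INTEGER);
    CREATE TABLE comment (post_id INTEGER, date_posted INTEGER);
""")

N_POSTS, N_EDITS, N_COMMENTS = 3, 10, 100
post_ids = range(1, N_POSTS + 1)
conn.executemany("INSERT INTO post VALUES (?)", [(p,) for p in post_ids])
conn.executemany(
    "INSERT INTO edit VALUES (?, ?)",
    [(p, d) for p in post_ids for d in range(N_EDITS)],
)
conn.executemany(
    "INSERT INTO comment VALUES (?, ?)",
    [(p, d) for p in post_ids for d in range(N_COMMENTS)],
)


def count_rows(sql):
    """Return how many rows the given SELECT produces."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


# One join: each post is repeated once per edit.
one_join = count_rows(
    "SELECT post.id FROM post LEFT JOIN edit ON post.id = edit.post_id"
)

# Two joins on post.id only: each edit is paired with each comment.
two_joins = count_rows(
    "SELECT post.id FROM post "
    "LEFT JOIN edit ON post.id = edit.post_id "
    "LEFT JOIN comment ON post.id = comment.post_id"
)

print(one_join)   # 30   = 3 posts * 10 edits
print(two_joins)  # 3000 = 3 posts * 10 edits * 100 comments
```

The second count grows with the product of the two per-post row counts rather than their sum, which is why adding the second join changes the cost so dramatically.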
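The solution is to compute row_number() inside a subquery for each table, so that each side is reduced to one row per post before it is joined: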
SELECT post.*, comment.*, edit.*
FROM
post
LEFT JOIN (select e.*,
ROW_NUMBER() OVER (PARTITION BY post_id ORDER BY e.date_applied DESC) AS rna
from edit e
) edit
ON post.id = edit.post_id and rna = 1
LEFT JOIN (select c.*,
ROW_NUMBER() OVER (PARTITION BY post_id ORDER BY c.date_posted DESC) AS rnb
from comment c
) comment
ON post.id = comment.post_id and rnb = 1
ORDER BY
post.id DESC
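To check that ranking inside each subquery really yields one row per post, the same query shape can be run on a few rows. This is a minimal sketch under stated assumptions: it uses Python's sqlite3 module (window functions need SQLite 3.25 or newer) rather than PostgreSQL, the aliases e and c and the integer "dates" are hypothetical, and only a few columns are selected to keep the output readable.

```python
import sqlite3

# SQLite stands in for PostgreSQL; the data below is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post    (id INTEGER PRIMARY KEY);
    CREATE TABLE edit    (post_id INTEGER, date_applied INTEGER);
    CREATE TABLE comment (post_id INTEGER, date_posted INTEGER);
    INSERT INTO post VALUES (1), (2), (3);
    -- post 1 has edits and comments, post 2 only comments, post 3 neither
    INSERT INTO edit VALUES (1, 10), (1, 30), (1, 20);
    INSERT INTO comment VALUES (1, 5), (1, 7), (2, 9), (2, 4);
""")

# Each subquery ranks its own table per post_id; the join condition keeps
# only the newest row (rank 1), so the two sides never multiply each other.
rows = conn.execute("""
    SELECT post.id, e.date_applied, c.date_posted
    FROM post
    LEFT JOIN (SELECT edit.*,
                      ROW_NUMBER() OVER (PARTITION BY post_id
                                         ORDER BY date_applied DESC) AS rna
               FROM edit) AS e
           ON post.id = e.post_id AND e.rna = 1
    LEFT JOIN (SELECT comment.*,
                      ROW_NUMBER() OVER (PARTITION BY post_id
                                         ORDER BY date_posted DESC) AS rnb
               FROM comment) AS c
           ON post.id = c.post_id AND c.rnb = 1
    ORDER BY post.id DESC
""").fetchall()

print(rows)  # [(3, None, None), (2, None, 9), (1, 30, 7)]
```

Every post appears exactly once, and posts without edits or comments are kept with NULLs because the rank filter sits in the ON clause rather than in a WHERE clause.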
Answer 1 (score: 1)
Another way to write the query Gordon Linoff posted, using DISTINCT ON to keep only the newest row per post_id:
SELECT post.*, comment.*, edit.*
FROM
post
LEFT JOIN (SELECT DISTINCT ON (e.post_id) e.*
FROM edit e
ORDER BY e.post_id DESC, e.date_applied DESC
) edit
ON post.id = edit.post_id
LEFT JOIN (SELECT DISTINCT ON (c.post_id) c.*
FROM comment c
ORDER BY c.post_id DESC, c.date_posted DESC
) comment
ON post.id = comment.post_id
ORDER BY
post.id DESC
It may (or may not) be faster on your data. You will have to test it.