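I'm trying to write a query that pulls the top 1000 posts (by time_spent) for each of a set of specified tags. I came up with the following query, where 1, 2, 3 are the specified tags: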
SELECT g.tagid, e.post_id, SUM(e.time_spent) AS time
FROM post_table e
JOIN (SELECT g.postid, g.tagid
      FROM tags_table g
      WHERE g.tagid IN (1, 2, 3)) g
  ON e.post_id = g.postid
WHERE dt >= '2018-06-01'
GROUP BY g.tagid, e.post_id
ORDER BY time DESC
LIMIT 1000
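However, the problem with using LIMIT 1000 here is that it caps the entire result set at 1000 rows in total, instead of returning 1000 results each for tag 1, tag 2, and tag 3 (i.e. 3000 results in total).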
How can I modify this query so that the LIMIT applies only to the e.post_id component of the GROUP BY? Or is there another way to get 1000 results for each of the specified tags?
Answer 0 (score: 1)
Use row_number() to number the posts within each tag, then keep only the first 1000 per tag:
SELECT ge.*
FROM (SELECT g.tagid, e.post_id, SUM(e.time_spent) AS time,
             -- rank posts within each tag, largest total time first
             ROW_NUMBER() OVER (PARTITION BY g.tagid ORDER BY SUM(e.time_spent) DESC) AS seqnum
      FROM post_table e JOIN
           tags_table g
           ON e.post_id = g.postid
      WHERE g.tagid IN (1, 2, 3) AND dt >= '2018-06-01'
      GROUP BY g.tagid, e.post_id
     ) ge
-- keep the top 1000 rows of each tag, not 1000 rows overall
WHERE seqnum <= 1000
ORDER BY ge.tagid, time DESC
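
If you want to sanity-check the per-tag limit before running it on real data, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). Everything in it is an assumption for illustration: the column types, the sample rows, the helper name top_posts_per_tag, and the limit of 2 per tag instead of 1000. It also renames the alias to total_time and does the aggregation in its own derived table first, which is a slight restructuring of the query above rather than a copy of it.

```python
import sqlite3

# Hypothetical miniature of the tables from the question; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post_table (post_id INTEGER, time_spent INTEGER, dt TEXT);
    CREATE TABLE tags_table (postid INTEGER, tagid INTEGER);
""")
conn.executemany(
    "INSERT INTO post_table VALUES (?, ?, ?)",
    [
        (10, 5, "2018-06-02"),
        (10, 7, "2018-06-03"),   # post 10 totals 12 after the date filter
        (11, 9, "2018-06-02"),
        (12, 3, "2018-06-04"),
        (13, 20, "2018-06-05"),
        (14, 1, "2018-06-06"),
        (15, 50, "2018-05-01"),  # before the cutoff date, so it is filtered out
    ],
)
conn.executemany(
    "INSERT INTO tags_table VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 1), (13, 2), (14, 2), (15, 2), (10, 3)],
)


def top_posts_per_tag(conn, per_tag):
    """Return (tagid, post_id, total_time) rows, at most per_tag rows for each tag."""
    query = """
        SELECT ranked.tagid, ranked.post_id, ranked.total_time
        FROM (SELECT agg.tagid, agg.post_id, agg.total_time,
                     ROW_NUMBER() OVER (PARTITION BY agg.tagid
                                        ORDER BY agg.total_time DESC) AS seqnum
              FROM (SELECT g.tagid, e.post_id, SUM(e.time_spent) AS total_time
                    FROM post_table e
                    JOIN tags_table g ON e.post_id = g.postid
                    WHERE g.tagid IN (1, 2, 3) AND e.dt >= '2018-06-01'
                    GROUP BY g.tagid, e.post_id
                   ) agg
             ) ranked
        WHERE ranked.seqnum <= ?
        ORDER BY ranked.tagid, ranked.total_time DESC
    """
    return conn.execute(query, (per_tag,)).fetchall()


rows = top_posts_per_tag(conn, 2)
for row in rows:
    print(row)
```

With this sample data, tag 1 has three qualifying posts and only the two with the largest totals come back, while tags 2 and 3 return everything they have. Note that ROW_NUMBER() breaks ties arbitrarily; if tied posts should all be kept, RANK() or DENSE_RANK() may be a better fit, at the cost of possibly returning more than the limit per tag.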