MySQL: limit each value of a field to at most 5 rows in the results

Time: 2018-11-20 18:56:42

Tags: mysql sql

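Background

I run a platform that lets users follow creators and view their content.

The following query successfully returns 50 posts ordered by popularity. There is some additional logic that hides posts the user has already saved/removed, but that is not relevant to this question.

Problem:

If one creator is particularly popular (high popularity), almost all of the top 50 posts returned will be by that creator.

This skews the results, because ideally the 50 posts returned would not be dominated by any single author.

Question:

How can I limit the query so that any one author (identified by the posted_by field) appears no more than 5 times? Fewer is fine, but never more than 5.

The final result should still be ordered by popularity DESC.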

SELECT * 
FROM   `source_posts` 
WHERE  `posted_by` IN (SELECT `username` 
                       FROM   `source_accounts` 
                       WHERE  `id` IN (SELECT `sourceid` 
                                       FROM   `user_source_accounts` 
                                       WHERE  `profileid` = '100')) 
       AND `id` NOT IN (SELECT `postid` 
                        FROM   `user_posts_removed` 
                        WHERE  `profileid` = '100') 
       AND `live` = '1' 
       AND `added` >= Date_sub(Now(), INTERVAL 1 month) 
       AND `popularity` > 1 
ORDER  BY `popularity` DESC 
LIMIT  50 

Thanks.

Edit

I am using MySQL version 5.7.24, so unfortunately the row_number() function is not available in this case.

2 answers:

Answer 0: (score: 1)

In MySQL 8+, you can simply use row_number():

select sp.*
from (select sp.*,
             row_number() over (partition by posted_by order by popularity desc) as seqnum
      from source_posts sp
     ) sp
where seqnum <= 5
order by popularity desc
limit 50;

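I am not sure what the rest of your query is doing, because it is not described in your question. You can of course add further filter conditions or joins.

Edit:

In earlier versions, you can use variables: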

select sp.*
from (select sp.*,
             (@rn := if(@p = posted_by, @rn + 1,
                        if(@p := posted_by, 1, 1)
                       )
             ) as rn
      from (select sp.*
            from source_posts sp
            order by posted_by, popularity desc
           ) sp cross join
           (select @p := '', @rn := 0) params
     ) sp
where rn <= 5
order by popularity desc
limit 50;
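
For MySQL 5.7, a variable-free alternative is a correlated subquery that counts how many of the same author's posts rank ahead of each row. The following is only a minimal sketch under stated assumptions, not a tested solution: it assumes `id` is a unique key on source_posts (used here purely to break popularity ties), and it repeats the question's filters inside the subquery so that hidden or stale posts do not use up an author's 5 slots.

```sql
-- Sketch: keep a post only if fewer than 5 visible posts by the same author
-- rank ahead of it (higher popularity; ties broken by lower id).
-- Assumption: source_posts.id is unique.
SELECT sp.*
FROM   source_posts sp
WHERE  sp.posted_by IN (SELECT username
                        FROM   source_accounts
                        WHERE  id IN (SELECT sourceid
                                      FROM   user_source_accounts
                                      WHERE  profileid = '100'))
       AND sp.id NOT IN (SELECT postid
                         FROM   user_posts_removed
                         WHERE  profileid = '100')
       AND sp.live = '1'
       AND sp.added >= DATE_SUB(NOW(), INTERVAL 1 MONTH)
       AND sp.popularity > 1
       AND (SELECT COUNT(*)
            FROM   source_posts sp2
            WHERE  sp2.posted_by = sp.posted_by
                   -- same visibility filters as the outer query; popularity > 1
                   -- is implied by the comparison with sp.popularity below
                   AND sp2.id NOT IN (SELECT postid
                                      FROM   user_posts_removed
                                      WHERE  profileid = '100')
                   AND sp2.live = '1'
                   AND sp2.added >= DATE_SUB(NOW(), INTERVAL 1 MONTH)
                   AND (sp2.popularity > sp.popularity
                        OR (sp2.popularity = sp.popularity AND sp2.id < sp.id))
           ) < 5
ORDER  BY sp.popularity DESC
LIMIT  50;
```

The subquery runs once per candidate row, so on a large table this may be slow unless there is a suitable index, for example on (posted_by, popularity); whether such an index exists is an assumption about the schema.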

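Answer 1: (score: 1)

You can try the row_number function. It numbers each author's posts 1, 2, 3, ... in order of popularity. So if an author has 50 records, only the records whose row_number (named "rank") is less than or equal to 5 are returned. Note that row_number() requires MySQL 8.0 or later.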

SELECT *
FROM   (SELECT sp.*,
               -- number each author's posts by popularity; `rank` is quoted
               -- because RANK is a reserved word in MySQL 8.0
               row_number() OVER (PARTITION BY `posted_by` ORDER BY `popularity` DESC) AS `rank`
        FROM   `source_posts` sp
        WHERE  `posted_by` IN (SELECT `username`
                               FROM   `source_accounts`
                               WHERE  `id` IN (SELECT `sourceid`
                                               FROM   `user_source_accounts`
                                               WHERE  `profileid` = '100'))
               AND `id` NOT IN (SELECT `postid`
                                FROM   `user_posts_removed`
                                WHERE  `profileid` = '100')
               AND `live` = '1'
               AND `added` >= Date_sub(Now(), INTERVAL 1 month)
               AND `popularity` > 1) ranked
WHERE  `rank` <= 5
-- sort and limit only after the per-author cap has been applied
ORDER  BY `popularity` DESC
LIMIT  50