Average of the latest N records per group

Time: 2013-06-05 18:00:11

Tags: mysql average greatest-n-per-group limit-per-group

My application currently calculates the average points from all of each user's records:

SELECT `user_id`, AVG(`points`) AS pts 
FROM `players` 
WHERE `points` != 0 
GROUP BY `user_id`

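The business requirements have changed, and I now need to calculate the average from each user's 30 most recent records.

The relevant tables have the following structure:

Table: players; columns: player_id, user_id, match_id, points

Table: users; columns: user_id

The following query does not work, but it demonstrates the logic I am trying to implement.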

SELECT @user_id := u.`id`, (
    -- Calculate the average for last 30 records
    SELECT AVG(plr.`points`) 
    FROM (
        -- Select the last 30 records for evaluation
        SELECT p.`points` 
        FROM `players` AS p 
        WHERE p.`user_id`=@user_id 
        ORDER BY `match_id` DESC 
        LIMIT 30
    ) AS plr
) AS avg_points 
FROM `users` AS u

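Is there a reasonably efficient way to calculate the average from each user's latest 30 records?

5 Answers:

Answer 0: (score: 9)

There is no reason to reinvent the wheel and risk ending up with buggy, suboptimal code. Your problem is a simple extension of the common per group limit problem. There are already tested and optimized solutions to solve this problem, and I suggest picking one of the following two solutions from that resource. These queries produce the latest 30 records for each player (rewritten for your tables):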

select user_id, points
from players
where (
   select count(*) from players as p
   where p.user_id = players.user_id and p.player_id >= players.player_id
) <= 30;

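(Just to make sure I understand your structure: I assume player_id is a unique key in the players table, and that one user can appear in this table as multiple players.)

The second tested and optimized solution uses MySQL variables: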

set @num := 0, @user_id := -1;

select user_id, points,
      @num := if(@user_id = user_id, @num + 1, 1) as row_num,
      @user_id := user_id as dummy
from players force index(user_id) /* optimization */
group by user_id, points, player_id /* player_id should be necessary here */
having row_num <= 30;

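The first query is not optimal (it is quadratic), while the second is optimal (a single pass) but works only in MySQL. The choice is yours. If you pick the second technique, take care and test it properly with your keys and database settings; the linked resource suggests that in some circumstances it might stop working.

Your final query is then simple: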

select user_id, avg(points)
from ( /* here goes one of the above solutions; 
          the "set" commands should go before this big query */ ) as t
group by user_id

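Note that I did not incorporate the condition from your first query (points != 0), because I do not fully understand your requirement (you did not describe it), and I also think this answer should be generic enough to help others with a similar problem.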
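To see how the final query composes with the first (counting) solution, here is a minimal sketch. It is an illustration only, under several assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, the sample rows are made up, N is set to 3 instead of 30 to keep the data small, and the outer table gets the alias `cur`, which is not in the queries above.

```python
import sqlite3

N = 3  # stand-in for the 30 in the question, kept small for the sample data

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE players ("
    "player_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "match_id INTEGER, points INTEGER)"
)
# Made-up rows: user 1 has five records, user 2 has only two.
rows = [
    (1, 1, 101, 10),
    (2, 1, 102, 20),
    (3, 1, 103, 30),
    (4, 1, 104, 40),
    (5, 1, 105, 50),
    (6, 2, 101, 6),
    (7, 2, 102, 8),
]
conn.executemany("INSERT INTO players VALUES (?, ?, ?, ?)", rows)

# Inner query: keep a row only if at most N rows of the same user
# have a player_id greater than or equal to its own.
# Outer query: average the rows that survive.
query = """
SELECT user_id, AVG(points) AS pts
FROM (
    SELECT cur.user_id, cur.points
    FROM players AS cur
    WHERE (
        SELECT COUNT(*) FROM players AS p
        WHERE p.user_id = cur.user_id
          AND p.player_id >= cur.player_id
    ) <= ?
) AS t
GROUP BY user_id
ORDER BY user_id
"""
averages = dict(conn.execute(query, (N,)).fetchall())
print(averages)  # {1: 40.0, 2: 7.0}
```

User 1 is averaged over the three rows with the highest player_id (30, 40, 50), while user 2, who has fewer than N rows, is averaged over everything available.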

Answer 1: (score: 8)

Try this:

SELECT user_id, AVG(points) AS pts 
FROM (SELECT user_id, IF(@uid = (@uid := user_id), @auto:=@auto + 1, @auto := 1) autoNo, points
      FROM players, (SELECT @uid := 0, @auto:= 1) A 
      WHERE points != 0 
      ORDER BY user_id, match_id DESC
     ) AS A 
WHERE autoNo <= 30
GROUP BY user_id;
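The same row-numbering idea can also be written without user variables on engines that support window functions (MySQL added them in 8.0). The sketch below is not part of the answer above. It assumes Python's built-in sqlite3 as a stand-in for MySQL (window functions need SQLite 3.25 or newer), made-up sample rows, and N = 3 instead of 30.

```python
import sqlite3

N = 3  # stand-in for the 30 in the question

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE players ("
    "player_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "match_id INTEGER, points INTEGER)"
)
# Made-up rows; the zero-point row is filtered out before ranking.
rows = [
    (1, 1, 101, 10),
    (2, 1, 102, 0),
    (3, 1, 103, 30),
    (4, 1, 104, 40),
    (5, 1, 105, 50),
    (6, 2, 101, 6),
    (7, 2, 102, 8),
]
conn.executemany("INSERT INTO players VALUES (?, ?, ?, ?)", rows)

# Number each user's non-zero rows from newest to oldest match,
# then average only the rows numbered 1..N.
query = """
SELECT user_id, AVG(points) AS pts
FROM (
    SELECT user_id, points,
           ROW_NUMBER() OVER (
               PARTITION BY user_id ORDER BY match_id DESC
           ) AS rn
    FROM players
    WHERE points != 0
) AS ranked
WHERE rn <= ?
GROUP BY user_id
ORDER BY user_id
"""
window_avg = dict(conn.execute(query, (N,)).fetchall())
print(window_avg)  # {1: 40.0, 2: 7.0}
```

Because the numbering is declared in the query itself, this form does not depend on the order in which the engine evaluates variable assignments.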

Answer 2: (score: 0)

This should work:

SELECT p1.user_id, avg(points) as pts
  FROM players p1, (
    SELECT u.user_id, (
         SELECT match_id
           FROM players p2
          WHERE p2.user_id = u.user_id
          ORDER BY match_id DESC
          LIMIT 29, 1 ) mid
      FROM users u
    HAVING mid IS NOT NULL) m
 WHERE p1.user_id = m.user_id
   AND p1.match_id >= m.mid
 GROUP BY p1.user_id

 UNION ALL

SELECT user_id, avg(points) AS pts 
  FROM players
 GROUP BY user_id
HAVING count(*) < 30

The part after UNION ALL is needed only if you want to include users who have fewer than 30 records.

Answer 3: (score: 0)
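A small runnable sketch of this threshold idea, under stated assumptions: Python's built-in sqlite3 stands in for MySQL, the rows are made up, N is 3 instead of 30, `LIMIT 1 OFFSET ?` replaces MySQL's `LIMIT 29, 1`, and a WHERE on the derived table replaces the HAVING without GROUP BY, which older SQLite versions reject.

```python
import sqlite3

N = 3  # stand-in for the 30 in the question

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE players ("
    "player_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "match_id INTEGER, points INTEGER)"
)
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,)])
# Made-up rows: user 1 has five records, user 2 has only two.
rows = [
    (1, 1, 101, 10),
    (2, 1, 102, 20),
    (3, 1, 103, 30),
    (4, 1, 104, 40),
    (5, 1, 105, 50),
    (6, 2, 101, 6),
    (7, 2, 102, 8),
]
conn.executemany("INSERT INTO players VALUES (?, ?, ?, ?)", rows)

# First branch: find each user's N-th newest match_id (mid) and
# average the rows at or after it. Second branch: users with fewer
# than N rows have no N-th row, so average all of their rows.
query = """
SELECT p1.user_id AS user_id, AVG(p1.points) AS pts
FROM players AS p1
JOIN (
    SELECT u.user_id, (
        SELECT p2.match_id
        FROM players AS p2
        WHERE p2.user_id = u.user_id
        ORDER BY p2.match_id DESC
        LIMIT 1 OFFSET ?
    ) AS mid
    FROM users AS u
) AS m ON p1.user_id = m.user_id
WHERE m.mid IS NOT NULL
  AND p1.match_id >= m.mid
GROUP BY p1.user_id

UNION ALL

SELECT user_id, AVG(points) AS pts
FROM players
GROUP BY user_id
HAVING COUNT(*) < ?
"""
threshold_avg = dict(conn.execute(query, (N - 1, N)).fetchall())
print(threshold_avg)
```

Note that this approach relies on match_id being unique within a user: if several rows share the threshold match_id, all of them pass the `>=` test and more than N rows are averaged.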

SELECT 
u.`id`, 
(SELECT AVG(p.`points`) FROM `players` AS p WHERE p.`user_id`=u.`id` 
ORDER BY p.`user_id` DESC LIMIT 30) AS AVG
FROM `users` AS u Group by u.`id`

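You could also try this... Be aware, though, that ORDER BY and LIMIT are applied after AVG has already collapsed the rows into one, so as written this averages all of a user's records rather than the latest 30.

Answer 4: (score: 0)

If I understand your logic correctly, you need to calculate each user's average points from their last 30 records (ordered by match_id).

First, you need to return the last 30 records for each user, which you can do with a query like this: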

SELECT p.user_id, p.match_id, p.points
FROM
  players p INNER JOIN players c
  ON p.user_id=c.user_id AND p.match_id<=c.match_id
     AND p.points!=0 and c.points!=0
GROUP BY
  p.user_id, p.match_id, p.points
HAVING
  COUNT(c.user_id)<=30

Then you need to calculate the average over the previous query:

SELECT user_id, AVG(points)
FROM (
  SELECT p.user_id, p.match_id, p.points
  FROM
    players p INNER JOIN players c
    ON p.user_id=c.user_id AND p.match_id<=c.match_id
       AND p.points!=0 and c.points!=0
  GROUP BY
    p.user_id, p.match_id, p.points
  HAVING
    COUNT(c.user_id)<=30
  ) l
GROUP BY user_id