Calculating quantiles for a score/ranking system (PHP / MySQL)

Time: 2014-09-15 14:19:06

Tags: php mysql sql

There are two tables:

user

+----+-----------+
| id | user_name |
+----+-----------+
|  1 |   Alice   |
|  2 |   Steve   |
|  3 |   Tommy   |
+----+-----------+

result

+----+---------+-------+-------------+
| id | user_id | score |  timestamp  |
+----+---------+-------+-------------+
|  1 |    1    |   22  |  1410793838 |
|  2 |    1    |   16  |  1410793911 |
|  3 |    2    |    9  |  1410793920 |
|  4 |    1    |   27  |  1410794007 |
|  5 |    3    |   32  |  1410794023 |
+----+---------+-------+-------------+

What I have so far is a "top 3" query, which works well and looks like this:

SELECT MAX(r.score) AS score, u.user_name
FROM result AS r
INNER JOIN user AS u ON r.user_id = u.id
GROUP BY r.user_id
ORDER BY score DESC
LIMIT 3;

+-------+-----------+
| score | user_name |
+-------+-----------+
|   32  |   Tommy   |
|   27  |   Alice   |
|    9  |   Steve   |
+-------+-----------+

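The table actually holds hundreds of results; this is just an example. I am looking for a compact algorithm to get the rank of a specific user relative to all other users, as a percentage. The goal is to output something like "You are in the top 5% / 10% / 20% / 50%" or "You are below average". While it is easy to determine whether someone is below average (score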

3 answers:

Answer 0 (score: 2)

If I understand correctly, this is just a calculation relative to the maximum:

SELECT
  user_name,
  MAX(score) AS max_score,
  CASE
    WHEN ROUND(100*MAX(score)/maximum, 2)>=95 THEN 'In top 5%'
    WHEN ROUND(100*MAX(score)/maximum, 2)>=90 THEN 'In top 10%'
    WHEN ROUND(100*MAX(score)/maximum, 2)>=75 THEN 'In top 25%'
    WHEN ROUND(100*MAX(score)/maximum, 2)>=50 THEN 'In top 50%'
    WHEN ROUND(100*MAX(score)/maximum, 2)>=0 THEN 'Below average'
  END AS score_mark
FROM
  `result`
    INNER JOIN `user`
      ON `result`.user_id=`user`.id
    CROSS JOIN
      (SELECT MAX(score) AS maximum FROM `result`) AS init
GROUP BY
  user_id

So the calculation is based on the highest score in the whole table, and the results are grouped per user. See the fiddle.
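To make the thresholds concrete, here is a minimal PHP sketch of the same idea, run against the sample data from the question. The function name labelRelativeToMax is hypothetical, and treating every remaining case as "Below average" is an assumption of this sketch, not something the query guarantees.

```php
<?php
// Hypothetical helper: label a user's best score by its ratio to the
// overall maximum, using the same thresholds as the CASE expression.
function labelRelativeToMax(float $bestScore, float $maximum): string
{
    if ($maximum <= 0) {
        return 'Below average'; // assumption: no positive maximum, no meaningful ratio
    }
    $percent = round(100 * $bestScore / $maximum, 2);
    if ($percent >= 95) return 'In top 5%';
    if ($percent >= 90) return 'In top 10%';
    if ($percent >= 75) return 'In top 25%';
    if ($percent >= 50) return 'In top 50%';
    return 'Below average';
}

// Best score per user, taken from the sample result table.
$best = ['Alice' => 27, 'Steve' => 9, 'Tommy' => 32];
$maximum = max($best);
foreach ($best as $name => $score) {
    echo $name, ': ', labelRelativeToMax($score, $maximum), PHP_EOL;
}
```

With the sample data, Alice reaches 27/32 of the maximum (about 84%), so she lands in the "top 25%" band even though there are only three users. That is the weakness of measuring against the maximum instead of against the other users.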

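As noted below, this way of counting relies on a naive notion of "average": every calculation is relative to the overall maximum. That may not be what is needed. If the question is about a user's position relative to the other scores (not relative to the maximum), it gets more complicated: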

  SELECT
    maxs.*,
    @num:=@num+1 AS order_num,
    CASE
      WHEN 100*(@num-1)/(user_count-1) <=   5 THEN 'In top 5%'
      WHEN 100*(@num-1)/(user_count-1) <=  10 THEN 'In top 10%'
      WHEN 100*(@num-1)/(user_count-1) <=  25 THEN 'In top 25%'
      WHEN 100*(@num-1)/(user_count-1) <=  50 THEN 'In top 50%'
      WHEN 100*(@num-1)/(user_count-1) <= 100 THEN 'Below average'
    END AS score_mark
  FROM
    (SELECT
      user_name,
      MAX(score) AS max_score
    FROM
      `result`
        INNER JOIN `user`
          ON `result`.user_id = `user`.id
    GROUP BY
      user_id
    ORDER BY
      max_score DESC) AS maxs
    CROSS JOIN
      (SELECT 
        @num:=0,
        COUNT(DISTINCT user_id) AS user_count
      FROM
        `result`) AS init

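Now we first have to number the positions and only then do the relative calculation. Here is the corresponding fiddle. Note that I apply a linear formula here in which the first position maps to "zero" and the last position maps to "100". If that is not the intent (there will be edge cases, for example position "3" landing in "50%" for "5 in total" in the fiddle), you can change the divisor to user_count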
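The effect of the divisor is easier to see with numbers. The following is only a sketch in PHP; positionPercent and its $zeroToHundred flag are names made up for this illustration, and returning 0 for a single user is an assumption.

```php
<?php
// Hypothetical helper: percentage position of a 1-based rank among $userCount users.
// $zeroToHundred = true uses the (rank - 1) / (count - 1) mapping from the query,
// false uses the alternative divisor user_count.
function positionPercent(int $rank, int $userCount, bool $zeroToHundred = true): float
{
    $divisor = $zeroToHundred ? $userCount - 1 : $userCount;
    if ($divisor <= 0) {
        return 0.0; // assumption: a single user is treated as the top position
    }
    return 100 * ($rank - 1) / $divisor;
}

// Compare both divisors for five users.
foreach (range(1, 5) as $rank) {
    printf(
        "rank %d: %.1f%% (count - 1) vs %.1f%% (count)\n",
        $rank,
        positionPercent($rank, 5),
        positionPercent($rank, 5, false)
    );
}
```

For rank 3 of 5, the first mapping gives 50 and the second gives 40, which is the edge case mentioned above.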

Answer 1 (score: 1)

Here is another version

    SELECT user_name, score,
           (CASE
                WHEN score BETWEEN @max - ((@max - @min) / 10) AND @max THEN '10'
                WHEN score BETWEEN @max - ((@max - @min) / 5)  AND @max THEN '20'
                WHEN score BETWEEN @max - ((@max - @min) / 2)  AND @max THEN '50'
                ELSE 'more50'
            END) AS rangescore
    FROM result r
    INNER JOIN user u ON r.user_id = u.id,
    (SELECT @max := MAX(score) FROM result) x,
    (SELECT @min := MIN(score) FROM result) y
    ORDER BY score DESC

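If you want to compare users by their average score, you can use AVG(score) instead of MAX

If you want every individual score, remove the aggregate function and the GROUP BY.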
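For reference, the bucketing rule of this query can be sketched in PHP as follows. rangeBucket is a hypothetical name. The sketch splits the span between the lowest and highest score, so it describes where a score sits in the value range, not how many users are ahead of it.

```php
<?php
// Hypothetical helper: bucket a score by its distance from the maximum,
// measured as a fraction of the (max - min) span, like the BETWEEN checks.
function rangeBucket(float $score, float $min, float $max): string
{
    $span = $max - $min;
    if ($score >= $max - $span / 10) return '10';
    if ($score >= $max - $span / 5)  return '20';
    if ($score >= $max - $span / 2)  return '50';
    return 'more50';
}

// All individual scores from the sample result table.
$scores = [22, 16, 9, 27, 32];
foreach ($scores as $score) {
    echo $score, ' => ', rangeBucket($score, min($scores), max($scores)), PHP_EOL;
}
```

With the sample data the span is 23, so the cut-offs are 29.7, 27.4 and 20.5; a score of 27 therefore falls just short of the '20' bucket and ends up in '50'.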

FIDDLE

FIDDLE GROUP

Answer 2 (score: 0)

OK, given the new statement, I will offer you another solution:

As you said you can use PHP and MySQL together, I will give you a combined one.

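You want to calculate quantiles (see quantiles on Wikipedia), because if you have one top scorer with around 10000 points while all other players have only 100 points or fewer, then a player with 100 points is in the top 5% of players, yet has far less than 50% of the top scorer's points.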
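A small PHP sketch of that situation, using made-up data (one outlier with 10000 points and 99 players with 2 to 100 points), shows how far the two measures can drift apart. All variable names here are illustrative.

```php
<?php
// Made-up data for illustration: one outlier plus 99 players scoring 2..100.
$scores = array_merge([10000], range(2, 100));
rsort($scores); // highest score first

$myScore = 100;
$rank = array_search($myScore, $scores) + 1;   // 1-based rank in the sorted list
$rankPercent = 100 * $rank / count($scores);   // share of players at or above this rank
$ratioToMax = 100 * $myScore / max($scores);   // share of the top scorer's points

echo "rank $rank of ", count($scores), " => top $rankPercent%", PHP_EOL;
echo "share of the maximum: $ratioToMax%", PHP_EOL;
```

The player with 100 points is second out of 100 players, so well inside the top 5% by rank, while holding only 1% of the maximum score.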

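With that in mind, we can count the players, fetch the score found at the desired percentage of players, and check whether a given player's score beats it.

First select the maximum, minimum, count and so on.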

SELECT
 COUNT(DISTINCT `result`.`user_id`) `count`, -- number of users; this is $count below
 MAX(`result`.`score`) `max`,
 MIN(`result`.`score`) `min`,
 AVG(`result`.`score`) `avg`
FROM
 `result`

Once you have these values, you can calculate the quantile.

SELECT
 MAX(`result`.`score`) `best_score`
FROM
 `result`
GROUP BY
 `result`.`user_id`
ORDER BY
 `best_score` DESC
LIMIT $offset, 1
-- $offset = floor($count * $percent), computed in PHP because MySQL's LIMIT does not accept expressions;
-- $count is the value from the first query and $percent is the wanted quantile, e.g. 0.05 for 5%

Now that you know the score at each quantile, you can compare the actual score against it.

//where $percentNN is the score from the previous query
//checked from the most selective band down, so only one label is printed
if ($score > $percent5) echo "top 5%";
elseif ($score > $percent10) echo "top 10%";
elseif ($score > $percent20) echo "top 20%";
elseif ($score > $percent50) echo "top 50%";
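The steps above can also be tried out without a database. The following PHP sketch works on a plain array of per-user best scores. quantileThreshold and topLabel are hypothetical names, the percentage is passed as an integer so that the offset can be computed with intdiv (same result as floor for these values), and the fallback label is an assumption of this sketch.

```php
<?php
// Hypothetical helper: the score found at offset floor(count * percent / 100)
// in the descending list of best scores, i.e. what "LIMIT offset, 1" returns.
function quantileThreshold(array $bestScores, int $percent): ?float
{
    rsort($bestScores); // sorts a local copy, highest score first
    $offset = intdiv(count($bestScores) * $percent, 100);
    return isset($bestScores[$offset]) ? (float) $bestScores[$offset] : null;
}

// Hypothetical helper: first (most selective) band whose threshold the score beats.
function topLabel(float $score, array $bestScores): string
{
    foreach ([5, 10, 20, 50] as $percent) {
        $threshold = quantileThreshold($bestScores, $percent);
        if ($threshold !== null && $score > $threshold) {
            return "top $percent%";
        }
    }
    return 'below the top 50%'; // assumption: label for everyone else
}

// Made-up data: 20 users whose best scores are 1..20.
$bestScores = range(1, 20);
foreach ([20, 19, 17, 11, 10] as $score) {
    echo $score, ' => ', topLabel($score, $bestScores), PHP_EOL;
}
```

Ties at a threshold fall into the lower band because the comparison is strict, as in the original snippet; whether that is the desired behaviour depends on how you want to treat equal scores.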

Perhaps the multiple queries could be merged into one.