Question

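I'm trying to apply the Bayesian rating formula, but if I give a rating of 1 out of 5 thousands of times, the final rating ends up greater than 5.

For example, a given item has no votes, and after 170,000 votes of 1 star its final rating is 5.23. If I vote only 100 times, it has a normal value.

Here is what I have in PHP.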

<?php
// these values came from DB
$total_votes     = 2936;    // total of votes for all items
$total_rating    = 582.955; // sum of all ratings
$total_items     = 202;

// now the specific item, it has no votes yet
$this_num_votes  = 0;
$this_score      = 0;
$this_rating     = 0;

// simulating a lot of votes with 1 star
for ($i=0; $i < 170000; $i++) { 
    $rating_sent = 1; // the new rating, always 1

    $total_votes++; // adding 1 to total
    $total_rating = $total_rating+$rating_sent; // adding 1 to total

    $avg_num_votes = ($total_votes/$total_items); // Average number of votes in all items
    $avg_rating = ($total_rating/$total_items);   // Average rating for all items
    $this_num_votes = $this_num_votes+1;          // Number of votes for this item
    $this_score = $this_score+$rating_sent;       // Sum of all votes for this item
    $this_rating = $this_score/$this_num_votes;   // Rating for this item

    $bayesian_rating = ( ($avg_num_votes * $avg_rating) + ($this_num_votes * $this_rating) ) / ($avg_num_votes + $this_num_votes);
}
echo $bayesian_rating;
?>

Even if I vote with 1 or 2:

$rating_sent = rand(1,2)

the final rating after 100,000 votes is over 5.

I just ran a new test using

$rating_sent = rand(1,5)

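and after 100,000 votes I get a value completely out of range (10.53). I know that in a normal situation no item will get 170,000 votes while all the other items get none. But I wonder whether there is something wrong with my code, or whether this is the expected behavior of the Bayesian formula when there is a massive number of votes.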

Edit

Just to make it clear, here is a better explanation of some of the variables.

$avg_num_votes   // SUM(votes given to all items)/COUNT(all items)
$avg_rating      // SUM(rating of all items)/COUNT(all items)
$this_num_votes  // COUNT(votes given for this item)
$this_score      // SUM(rating for this item)
$bayesian_rating // is the formula itself

The formula is: ( (avg_num_votes * avg_rating) + (this_num_votes * this_rating) ) / (avg_num_votes + this_num_votes). Taken from here.

Answer 1

When calculating avg_rating, you need to divide by total_votes rather than total_items.
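
To see why the divisor matters, here is a rough sketch of the arithmetic using the numbers from the question. The symbols C, m, n, r, and B are shorthand introduced here for avg_num_votes, avg_rating, this_num_votes, this_rating, and bayesian_rating; they do not appear in the original post. The formula is a weighted average of m and r, so B always lies between those two values, and it can only leave the 1 to 5 scale if m does. Dividing the running vote total by the number of items makes m grow with every vote:

\[
B = \frac{C\,m + n\,r}{C + n},
\qquad
m = \frac{582.955 + 170000}{202} \approx 844.47,
\qquad
C = \frac{2936 + 170000}{202} \approx 856.12
\]

\[
B \approx \frac{856.12 \times 844.47 + 170000 \times 1}{856.12 + 170000} \approx 5.23
\]

This reproduces the 5.23 reported in the question, which suggests the out-of-range result comes from how avg_rating is computed rather than from the formula itself.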

I made that change and got something that behaves much better here:

http://codepad.org/gSdrUhZ2
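
For readers who cannot open the link, the following is a minimal sketch of the change described above, not necessarily the code behind the link. The function name bayesianRating and its parameter names are made up for illustration, and it assumes that total_rating holds the sum of every individual vote value across all items and that at least one vote exists overall.

```php
<?php
// Illustrative sketch only: bayesianRating() and its parameter names are hypothetical.
// Assumption: $totalRating is the sum of every individual vote value across all items,
// and $totalVotes is greater than zero.
function bayesianRating(int $totalVotes, float $totalRating, int $totalItems,
                        int $thisNumVotes, float $thisScore): float
{
    $avgNumVotes = $totalVotes / $totalItems;   // average number of votes per item
    $avgRating   = $totalRating / $totalVotes;  // average value of one vote (the suggested fix)
    $thisRating  = $thisNumVotes > 0 ? $thisScore / $thisNumVotes : 0.0;

    // Weighted average of the global mean and this item's mean.
    return (($avgNumVotes * $avgRating) + ($thisNumVotes * $thisRating))
         / ($avgNumVotes + $thisNumVotes);
}

// Same starting totals as the question, followed by 170,000 one-star votes for one item.
$votes  = 170000;
$result = bayesianRating(2936 + $votes, 582.955 + $votes, 202, $votes, (float) $votes);
echo $result, PHP_EOL; // stays on the rating scale, just under 1
```

Because avg_rating is now the mean of individual votes, it cannot exceed the highest vote ever cast, so the weighted average stays within the rating scale no matter how many votes are added.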
