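Why is the variance in random probability results so large?

Time: 2012-06-28 05:59:34

Tags: php linux

Why do the results deviate so much from the configured probabilities?

Test code: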

function probability($chances) {

    asort($chances);
    $sum    = array_sum($chances);
    $random = mt_rand(1, $sum);

    foreach($chances as $key => $chance) {
        if($random < $chance)
            return $key;
    }

    return $key;

}

$chances['case1'] = 10;
$chances['case2'] = 30;
$chances['case3'] = 60;

$result = array();

for($i = 0; $i < 100000; $i++)
    @$result[probability($chances)]++;

asort($result);
$sum = array_sum($result);

echo "Case\tCount\tOrig\tResult\n";

foreach($result as $key => $value)
    echo "$key\t$value\t".$chances[$key]."%\t".round($value / $sum * 100)."%\n";

Result:

Case    Count   Orig    Result
case1   14913   10%     15%
case2   33099   30%     33%
case3   51988   60%     52%

Is there a way to adjust this? I tried using mt_srand(), but it did not help.

Info:

$ php -v
PHP 5.3.10-1ubuntu3.2 with Suhosin-Patch (cli) (built: Jun 13 2012 17:20:55) 
Copyright (c) 1997-2012 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2012 Zend Technologies
    with Xdebug v2.1.0, Copyright (c) 2002-2010, by Derick Rethans
    with Suhosin v0.9.33, Copyright (c) 2007-2012, by SektionEins GmbH

$ uname -a
Linux desktop 3.2.0-26-generic-pae #41-Ubuntu SMP Thu Jun 14 16:45:14 UTC 2012 i686 i686 i386 GNU/Linux

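3 answers:

Answer 0 (score: 2)

Your random selection is flawed.

First, consider removing the asort call. It does nothing useful, and it is confusing (and slow). You are sorting the array 100000 times! It would be better either to make a sorted array a precondition (and sort it once, before the loop) or to implement an algorithm that does not need sorting.

Second, you need to make sure that the probability of hitting each case is correct. These are your current probabilities: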

case1:  9 % (1 <= $random <= 9)
case2: 20 % (10 <= $random <= 29)
case3: 71 % (everything that didn't match previous cases)

What you really need to do is something like this:

function probability($chances) {
    $sum    = array_sum($chances);
    $random = mt_rand(1, $sum);

    $add = 0;
    foreach($chances as $key => $chance) {
        if($random <= $chance + $add)
            return $key;
        else
            $add += $chance;
    }

    return $key;
}

This will give you the expected results:

case1: 10 % (1 <= $random <= 10)
case2: 30 % (11 <= $random <= 40)
case3: 60 % (41 <= $random <= 100)
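
These ranges can be checked without random sampling by enumerating every value that mt_rand(1, $sum) could return and counting which case each one selects. The sketch below does that for the cumulative approach. pickCase and exactDistribution are hypothetical helper names introduced only for this illustration, and the draw is passed in as a parameter instead of being generated.

```php
<?php
// Hypothetical helper: the same cumulative logic as above, but the draw is
// passed in so that every possible value can be checked deterministically.
function pickCase(array $chances, $random) {
    $add = 0;
    foreach ($chances as $key => $chance) {
        if ($random <= $chance + $add) {
            return $key;
        }
        $add += $chance;
    }
    return $key; // only reached if $random exceeds the total weight
}

// Hypothetical helper: count how many of the possible draws 1..sum
// land on each case.
function exactDistribution(array $chances) {
    $counts = array_fill_keys(array_keys($chances), 0);
    $sum = array_sum($chances);
    for ($random = 1; $random <= $sum; $random++) {
        $counts[pickCase($chances, $random)]++;
    }
    return $counts;
}

$chances = array('case1' => 10, 'case2' => 30, 'case3' => 60);
print_r(exactDistribution($chances));
// case1 => 10, case2 => 30, case3 => 60 (out of 100 possible draws)
```

Because the running total is carried along, the order of the entries does not affect the counts, which is why no sort is needed here.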

Answer 1 (score: 1)

$sum    = max($chances);

max() does not sum; use array_sum() instead.

Running the corrected code, I got this result:

Case    Count   Orig    Result
case1   11068   10%     11%
case2   29672   30%     30%
case3   59260   60%     59%
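
The difference between the two functions can be seen directly, and it appears to account for the 15% / 33% / 52% figures reported in the question. The following sketch enumerates every possible draw for a selection loop that uses asort and the < comparison, with max($chances) as the upper bound. questionPick and tally are hypothetical names used only for this illustration.

```php
<?php
// Hypothetical helper: a threshold-style selection loop (asort plus the
// < comparison), with the draw passed in instead of coming from mt_rand().
function questionPick(array $chances, $random) {
    asort($chances);
    foreach ($chances as $key => $chance) {
        if ($random < $chance) {
            return $key;
        }
    }
    return $key; // falls through to the largest entry
}

// Hypothetical helper: rounded percentage of the draws 1..$upper per case.
function tally(array $chances, $upper) {
    $counts = array_fill_keys(array_keys($chances), 0);
    for ($random = 1; $random <= $upper; $random++) {
        $counts[questionPick($chances, $random)]++;
    }
    foreach ($counts as $key => $count) {
        $counts[$key] = round($count / $upper * 100);
    }
    return $counts;
}

$chances = array('case1' => 10, 'case2' => 30, 'case3' => 60);

echo max($chances), "\n";       // 60: the largest single weight
echo array_sum($chances), "\n"; // 100: the total weight

// With draws limited to 1..60, the shares are 9/60, 20/60 and 31/60.
print_r(tally($chances, max($chances))); // 15, 33, 52 (percent)
```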

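Answer 2 (score: 1)

First, the comparison inside probability is wrong; it should be <= rather than <.

This should at least make the results more consistent (i.e. 10, 20, 70).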
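
The effect of the comparison operator can be checked by enumerating the 100 possible draws. This is a minimal sketch that compares each value directly against the draw, as the question's loop does. countHits and its $inclusive flag are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical helper: count, over the draws 1..sum, how often each key is
// returned when the values are compared directly against the draw.
// $inclusive switches between <= (true) and < (false).
function countHits(array $chances, $inclusive) {
    asort($chances);
    $counts = array_fill_keys(array_keys($chances), 0);
    $sum = array_sum($chances);
    for ($random = 1; $random <= $sum; $random++) {
        $picked = null;
        foreach ($chances as $key => $chance) {
            $hit = $inclusive ? ($random <= $chance) : ($random < $chance);
            if ($hit) {
                $picked = $key;
                break;
            }
        }
        if ($picked === null) {
            $picked = $key; // fall through to the last (largest) entry
        }
        $counts[$picked]++;
    }
    return $counts;
}

$chances = array('case1' => 10, 'case2' => 30, 'case3' => 60);
print_r(countHits($chances, false)); // case1 => 9,  case2 => 20, case3 => 71
print_r(countHits($chances, true));  // case1 => 10, case2 => 20, case3 => 70
```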

  

Second, case3 is counted twice (both when the number is <= 60 and when it is > 60).

I suggest this change to the code:

function probability($chances)
{
    $sum    = array_sum($chances);
    $random = mt_rand(1, $sum);

    foreach($chances as $key => $chance) {
        if ($random <= $chance) {
            return $key;
        }
    }

    return 'rest';
}

Then add 'rest' to the $chances array. The entries must appear in sorted order.

$chances['case1'] = 10;
$chances['case2'] = 30;
$chances['case3'] = 60;
$chances['rest'] = 'NA'; // for 60 < x <= 100

Result:

Case    Count   Orig    Result
case1   10083   10%     10%
case2   19965   30%     20%
case3   30084   60%     30%
rest    39868   NA%     40%