Randomly generated number favors high numbers?

Time: 2016-06-03 11:46:11

Tags: php random

I created a function that randomly picks a unique pool_id:

function pick_random_pool($res)
{
    // getting random number between 0 and 1 with 9 decimals
    $rand = mt_rand(0, 1000000000) / 1000000000;

    $pick = 0;

    $pool_id = 0;

    for ($ii = 0; $ii < count($res); $ii++) {
        // if a randomly generated number is bigger keep updating picked id to the last one
        if ($rand > $res[$ii]->probability) {

            $pick = $res[$ii]->probability;
            $pool_id = $res[$ii]->ip_pools_probability_id;

            // if a randomly generated number reaches a number  more than itself, it stops
        } else if ($rand < $res[$ii]->probability) {
            // now we need to figure out which of two candidate numbers is the closest to a randomly generated number
            if (($res[$ii]->probability - $rand) < ($rand - $pick)) {
                $pool_id = $res[$ii]->ip_pools_probability_id;
            }

            break;
            // if by some chance a randomly generated number is exactly the same as a pool's probability
        } else {
            $pool_id = $res[$ii]->ip_pools_probability_id;
            break;

        }
    }

    return $pool_id;
}

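It seems to work, but there is a serious problem: about 90% of the time it picks the entry with the largest probability in $res. For example, for this array: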

Array
(
[0] => stdClass Object
    (
        [ip_pools_probability_id] => 301
        [ip_pools_id] => 12
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.007166877
    )

[1] => stdClass Object
    (
        [ip_pools_probability_id] => 325
        [ip_pools_id] => 13
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.008621843
    )

[2] => stdClass Object
    (
        [ip_pools_probability_id] => 349
        [ip_pools_id] => 14
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.008984716
    )

[3] => stdClass Object
    (
        [ip_pools_probability_id] => 373
        [ip_pools_id] => 15
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.009268012
    )

[4] => stdClass Object
    (
        [ip_pools_probability_id] => 397
        [ip_pools_id] => 16
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.010412571
    )

...

[20] => stdClass Object
    (
        [ip_pools_probability_id] => 61
        [ip_pools_id] => 2
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.094612022
    )

[21] => stdClass Object
    (
        [ip_pools_probability_id] => 85
        [ip_pools_id] => 3
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.097033897
    )

[22] => stdClass Object
    (
        [ip_pools_probability_id] => 133
        [ip_pools_id] => 5
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.098855823
    )

[23] => stdClass Object
    (
        [ip_pools_probability_id] => 109
        [ip_pools_id] => 4
        [time_start] => 12:00:00
        [time_end] => 13:00:00
        [probability] => 0.099785568
    )

)

it picks element 23 about 90% of the time. Why isn't the selection evenly distributed? Where is the flaw in the code?

Thanks in advance.

1 Answer:

Answer 0: (score: 2)

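All of the probability values have 0 as their first decimal digit, for example:

0.098855823

Any $rand above 0.1 (so, about 90% of the draws) is higher than every probability in the list. For those draws only the first branch of the loop ever runs, the loop never breaks, and $pool_id ends up holding the ID of the last element.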
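To see this effect without relying on chance, the question's loop can be run with a fixed value in place of the random draw. The following is a minimal sketch: pick_with_fixed_rand is a hypothetical name, the $rand parameter is an assumption added for repeatability, and only four of the rows from the question's array are used.

```php
<?php
// Hypothetical helper: the question's loop, with $rand passed in
// instead of drawn from mt_rand() so that each run is repeatable.
function pick_with_fixed_rand(array $res, float $rand): int
{
    $pick = 0;
    $pool_id = 0;
    foreach ($res as $row) {
        if ($rand > $row->probability) {
            // $rand is above this probability: remember the row and keep going
            $pick = $row->probability;
            $pool_id = $row->ip_pools_probability_id;
        } elseif ($rand < $row->probability) {
            // first probability above $rand: keep whichever neighbour is closer
            if (($row->probability - $rand) < ($rand - $pick)) {
                $pool_id = $row->ip_pools_probability_id;
            }
            break;
        } else {
            $pool_id = $row->ip_pools_probability_id;
            break;
        }
    }
    return $pool_id;
}

// Four rows copied from the question's array (ascending by probability).
$res = [
    (object) ['ip_pools_probability_id' => 301, 'probability' => 0.007166877],
    (object) ['ip_pools_probability_id' => 325, 'probability' => 0.008621843],
    (object) ['ip_pools_probability_id' => 133, 'probability' => 0.098855823],
    (object) ['ip_pools_probability_id' => 109, 'probability' => 0.099785568],
];

// Both values are above 0.1, so both calls end on the last row.
var_dump(pick_with_fixed_rand($res, 0.25));
var_dump(pick_with_fixed_rand($res, 0.90));
```

Only draws below roughly 0.1 can reach the second or third branch, which is why the last row dominates the results.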

Edit: Actually, if you want to pick items according to their probabilities, you should try this:

function pick_random_pool($pools)
{
    // getting random number between 0 and 1 with 9 decimals
    $rand = mt_rand(0, 1000000000) / 1000000000;
    foreach ($pools as $pool) {
        // $rand falls inside this pool's slice of the 0..1 range: pick it
        if ($rand <= $pool->probability) {
            return $pool->ip_pools_probability_id;
        }
        // otherwise step past this pool's slice and try the next one
        $rand -= $pool->probability;
    }
    // fallback: return the last pool if the probabilities didn't sum up to 1
    return $pool->ip_pools_probability_id;
}
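The subtract-as-you-go selection can be checked in isolation. This is a minimal sketch, assuming the comparison is meant to test whether $rand falls inside the current pool's slice ($rand <= probability). The name pick_by_weight, the $rand parameter, and the three toy pools are illustrative assumptions and do not come from the question.

```php
<?php
// Hypothetical helper: weighted pick with $rand supplied by the caller,
// so individual cases can be checked deterministically.
function pick_by_weight(array $pools, float $rand): int
{
    $last = null;
    foreach ($pools as $pool) {
        if ($rand <= $pool->probability) {
            return $pool->ip_pools_probability_id;
        }
        $rand -= $pool->probability;
        $last = $pool;
    }
    if ($last === null) {
        throw new InvalidArgumentException('pool list is empty');
    }
    // fallback when the probabilities sum to less than $rand
    return $last->ip_pools_probability_id;
}

// Toy data (not from the question): weights 0.2, 0.3 and 0.5.
$pools = [
    (object) ['ip_pools_probability_id' => 1, 'probability' => 0.2],
    (object) ['ip_pools_probability_id' => 2, 'probability' => 0.3],
    (object) ['ip_pools_probability_id' => 3, 'probability' => 0.5],
];

// Tally 100000 random draws; the shares should land near 20% / 30% / 50%.
$counts = [1 => 0, 2 => 0, 3 => 0];
for ($i = 0; $i < 100000; $i++) {
    $rand = mt_rand(0, 1000000000) / 1000000000;
    $counts[pick_by_weight($pools, $rand)]++;
}
print_r($counts);
```

With this approach the order of the pools does not matter, and each pool is chosen in proportion to its probability as long as the probabilities sum to 1.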