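Is there an elegant way to count array values by a given interval/step to generate histogram data?

Time: 2014-08-11 16:30:23

Tags: php arrays count histogram

I would like to get histogram data for the following array. array_count_values() would work well, except that it only counts exact value matches. How can I elegantly do the same thing, but bundle the values by a given step or interval?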

$dataArray = array(385,515,975,1136,2394,2436,4051,4399,4484,4768,4768,4849,4856,4954,5020,5020,5020,5020,5020,5020,5020,5020,5020,5052,5163,5200,5271,5421,5421,5442,5746,5765,5903,5992,5992,6046,6122,6205,6208,6239,6310,6360,6416,6512,6536,6543,6581,6609,6696,6699,6752,6796,6806,6855,6859,6886,6906,6911,6923,6953,7016,7072,7086,7089,7110,7232,7278,7293,7304,7309,7348,7367,7378,7380,7419,7453,7454,7492,7506,7549,7563,7721,7723,7731,7745,7750,7751,7783,7791,7813,7813,7814,7818,7833,7863,7875,7886,7887,7902,7907,7935,7942,7942,7948,7973,7995,8002,8013,8013,8015,8024,8025,8030,8038,8041,8050,8056,8060,8064,8071,8081,8082,8085,8093,8124,8139,8142,8167,8179,8204,8214,8223,8225,8247,8248,8253,8258,8264,8265,8265,8269,8277,8278,8289,8300,8312,8314,8323,8328,8334,8363,8369,8390,8397,8399,8399,8401,8436,8442,8456,8457,8471,8474,8483,8503,8511,8516,8533,8560,8571,8575,8583,8592,8593,8626,8635,8635,8644,8659,8685,8695,8695,8702,8714,8715,8717,8729,8732,8740,8743,8750,8756,8772,8772,8778,8797,8828,8840,8840,8843,8856,8865,8874,8876,8878,8885,8887,8893,8896,8905,8910,8955,8970,8971,8991,8995,9014,9016,9042,9043,9063,9069,9104,9106,9107,9116,9131,9157,9227,9359,9471);
// if array_count_values accepted a step value I could do this:
print_r(array_count_values($dataArray,1000));

// expected result:
// array(1000  =>  3, 2000 => 1, ... 10000 => 15);
//       ^0-1000   ^[385,515,975]

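What do you recommend?

If I have to loop through all the values manually, is there an elegant way to round every value to a given interval?

4 Answers:

Answer 0 (score: 3)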

$step = 1000;
$result = array_count_values(
    array_map(
        function ($value) use ($step) {
            return (int) ceil($value / $step) * $step;
        },
        $dataArray
    )
);
var_dump($result);

Answer 1 (score: 1)

The rounding solution looks quite simple:

$step_size = 10;
$data = array(10, 20, 24, 30, 35, 50);
foreach ($data as $index => $value) {
    $data[$index] = round($value / $step_size) * $step_size;
}
// array(10, 20, 20, 30, 40, 50);
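One thing worth noting about the snippet above: round() snaps each value to the nearest multiple of the step, whereas the expected result in the question labels each bin by its upper edge, which corresponds to ceil(). The following is a minimal sketch comparing the two on the same sample data. The variable names $nearest and $upper are made up for this illustration.

```php
<?php
$step_size = 10;
$data = array(10, 20, 24, 30, 35, 50);

$nearest = array(); // snapped to the nearest multiple of $step_size
$upper = array();   // snapped up to the next multiple (upper bin edge)

foreach ($data as $value) {
    // The (int) cast applies to the round()/ceil() result before multiplying.
    $nearest[] = (int) round($value / $step_size) * $step_size;
    $upper[] = (int) ceil($value / $step_size) * $step_size;
}

print_r($nearest); // 24 becomes 20, 35 becomes 40
print_r($upper);   // 24 becomes 30, 35 becomes 40
```

With round(), 24 lands in the 20 bin, while with ceil() it lands in the 30 bin, so the choice of function decides where the bin boundaries sit.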

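Answer 2 (score: 1)

You can build the output directly, which avoids mapping the whole data array just to call array_count_values(). Below is a more general implementation that lets the mapping be defined outside the function itself: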

function array_count_values_callback(array $data, callable $fn)
{
    $freq = [];

    foreach ($data as $item) {
        $key = $fn($item);
        $freq[$key] = isset($freq[$key]) ? $freq[$key] + 1 : 1;
    }

    return $freq;
}

print_r(array_count_values_callback($dataArray, function($item) {
    return (int) ceil($item / 1000) * 1000;
}));
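Because the key function is passed in, the same helper can group values in other ways. As a hedged illustration, the sketch below labels each bin by its lower edge using floor() instead of ceil(). The function is copied from this answer so that the block runs on its own; $sample (the first six values of $dataArray) and $byLowerEdge are names chosen for this example.

```php
<?php
// Copied from the answer above so that this sketch is self-contained.
function array_count_values_callback(array $data, callable $fn)
{
    $freq = [];

    foreach ($data as $item) {
        $key = $fn($item);
        $freq[$key] = isset($freq[$key]) ? $freq[$key] + 1 : 1;
    }

    return $freq;
}

$step = 1000;
$sample = array(385, 515, 975, 1136, 2394, 2436);

// Label each bin by its lower edge: 0 covers 0-999, 1000 covers 1000-1999, ...
$byLowerEdge = array_count_values_callback($sample, function ($item) use ($step) {
    return (int) floor($item / $step) * $step;
});

print_r($byLowerEdge);
```

Note the boundary behaviour differs: with floor() a value of exactly 1000 is counted under key 1000 together with 1001 to 1999, while with ceil() it is counted under key 1000 together with 1 to 999.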

Answer 3 (score: 1)

Here is a simple solution in which you loop through $dataArray:

$step_size = 1000;
$histogramArray = array();
foreach ($dataArray as $v) {
    $k = (int)ceil($v / $step_size) * $step_size;
    if (!array_key_exists($k, $histogramArray)) $histogramArray[$k] = 0;
    $histogramArray[$k]++;
}

The output will be:

Array
(
    [1000] => 3
    [2000] => 1
    [3000] => 2
    [5000] => 8
    [6000] => 21
    [7000] => 25
    [8000] => 46
    [9000] => 110
    [10000] => 15
)
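The output above has no entry for 4000, because no value falls between 3001 and 4000. If the histogram is going to be plotted, it may be useful to have the empty bins present with a count of zero. The following is one possible sketch, not part of the original answer. The function name histogram_with_empty_bins is hypothetical, and $sample is just the first seven values of $dataArray.

```php
<?php
// Hypothetical helper: bin values by upper edge and fill gaps with zero counts.
function histogram_with_empty_bins(array $data, int $step): array
{
    if ($data === []) {
        return [];
    }

    // First pass: count the values per upper bin edge, as in the loop above.
    $counts = [];
    foreach ($data as $v) {
        $k = (int) ceil($v / $step) * $step;
        $counts[$k] = ($counts[$k] ?? 0) + 1;
    }

    // Second pass: walk from the lowest to the highest bin in steps of $step,
    // inserting a zero for every bin that received no values.
    $first = min(array_keys($counts));
    $last = max(array_keys($counts));
    $bins = [];
    for ($edge = $first; $edge <= $last; $edge += $step) {
        $bins[$edge] = $counts[$edge] ?? 0;
    }

    return $bins;
}

$sample = array(385, 515, 975, 1136, 2394, 2436, 4051);
$hist = histogram_with_empty_bins($sample, 1000);
print_r($hist);
```

For this sample the result contains the keys 1000, 2000, 3000, 4000 and 5000, with 4000 mapped to 0. The result is also ordered by bin edge regardless of the order of the input.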