Confusing median output from boost::accumulators::statistics

Time: 2019-03-17 07:00:33

Tags: c++ boost statistics median

When I use boost::accumulators::statistics to compute the median of an array, I get the following results from this code:

    accumulator_set< double, features< tag::mean, tag::median > > acc;
    acc(2);
    acc(1); 
    acc(3);
    value = mean( acc );   //output is 2, expected
    value = median( acc ); //output is 3, unexpected

I think the result of value = median( acc ) should be 2.

1 answer:

Answer 0: (score: 0)

accumulator_set does not actually store all the values. Each call to acc(double) in effect calls something like acc.mean_accumulator(double); acc.median_accumulator(double), and it deliberately tries *not* to store every value.
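To illustrate what "not storing every value" means, here is a minimal sketch of a streaming mean that keeps only a count and a running sum. `RunningMean` is a hypothetical name used for illustration; it is not part of Boost, and the real accumulators are more elaborate.

```cpp
#include <cstddef>

// Hypothetical illustration, not Boost code: a streaming mean that
// keeps only two numbers no matter how many samples it sees.
struct RunningMean {
    std::size_t count = 0;
    double sum = 0.0;

    // Fold one sample into the state; the sample itself is discarded.
    void operator()(double x) {
        ++count;
        sum += x;
    }

    // Mean of everything seen so far (0 if nothing was added).
    double result() const {
        return count == 0 ? 0.0 : sum / static_cast<double>(count);
    }
};
```

Feeding it 2, 1, 3 gives a mean of 2. A mean can be updated exactly in this way, but an exact median depends on the ordering of all samples, which is why a streaming median has to be an estimate.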

For median, it uses a P^2 quantile estimator. (See here) This is only an estimate. If you feed it five values instead:

acc(4);
acc(1);
acc(2);
acc(0);
acc(3);

it returns the expected 2.
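One way to make sense of both results is the toy model below. It rests on an assumption that is not confirmed here: that the estimator tracks five markers, fills them with the first five samples in arrival order, sorts them only once the fifth sample arrives, and always reports the middle marker. `ToyFiveMarkerMedian` is a hypothetical name, and the actual P^2 marker update for later samples is left out entirely.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical toy model, NOT Boost's implementation. It only models
// the first five samples of a five-marker estimator, under the
// assumption that markers are filled in arrival order and sorted
// once the fifth sample arrives.
struct ToyFiveMarkerMedian {
    std::array<double, 5> markers{};  // zero-initialized
    std::size_t count = 0;

    void operator()(double x) {
        if (count < 5) {
            markers[count] = x;
            ++count;
            if (count == 5) {
                std::sort(markers.begin(), markers.end());
            }
        }
        // Samples after the fifth would adjust the markers using the
        // P^2 update rule, which this sketch deliberately leaves out.
    }

    // Always reports the middle marker, even before all five are filled.
    double result() const { return markers[2]; }
};
```

Under this assumption, the samples 2, 1, 3 leave the third marker holding 3 (the third value to arrive), while 4, 1, 2, 0, 3 are sorted on the fifth sample and the middle marker becomes 2. That matches both observations, but it should be read as a plausible explanation rather than a description of the library's internals.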

If you want an exact value and have only a few data values, use a function like this:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

// Warning: reorders the elements in the range.
// `It` needs to be a non-const random access iterator (like `T*`),
// and the range must not be empty.
template<class It>
auto median(It first, It last) {
    auto size = last - first;
    auto mid = first + size / 2;
    // Puts the (upper) middle element at `mid`; everything before it is <= *mid.
    std::nth_element(first, mid, last);
    if (size % 2 == 1) {
        return *mid;
    }
    // Even count: the lower middle element is the largest one before `mid`.
    // nth_element does not sort that half, so it has to be searched for.
    auto low = std::max_element(first, mid);
    // Note: for integer element types this is an integer division.
    return (*low + *mid) / 2;
}

// Copies the range and modifies the copy instead
template<class It>
auto const_median(It first, It last) {
    // decltype(*first) would be a reference type, which a vector cannot hold.
    std::vector<typename std::iterator_traits<It>::value_type> v(first, last);
    return median(v.begin(), v.end());
}

int main() {
    std::vector<double> v{2, 1, 3};
    std::cout << median(v.begin(), v.end()) << '\n';  // prints 2
}
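If the samples arrive one at a time, as with acc(x), the same idea could be wrapped in a small collector that keeps the call style of an accumulator. This is only a sketch: `ExactMedian` is a hypothetical name, not part of Boost, and it trades memory (it stores every sample) for an exact answer.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of Boost: stores every sample so the
// median can be computed exactly on demand.
class ExactMedian {
public:
    // Record one sample.
    void operator()(double x) { samples_.push_back(x); }

    // Exact median of all samples recorded so far.
    double result() const {
        if (samples_.empty()) {
            throw std::logic_error("median of an empty sample set");
        }
        std::vector<double> v = samples_;  // work on a copy
        auto mid = v.begin() + v.size() / 2;
        std::nth_element(v.begin(), mid, v.end());
        if (v.size() % 2 == 1) {
            return *mid;
        }
        // Even count: average the two middle values. After nth_element,
        // the lower middle value is the largest element before `mid`.
        const double low = *std::max_element(v.begin(), mid);
        return (low + *mid) / 2.0;
    }

private:
    std::vector<double> samples_;
};
```

Feeding it 2, 1, 3 yields 2, and adding a fourth sample 4 yields 2.5, the average of the two middle values. Each call to result() copies the samples, so this suits small data sets, as the answer suggests.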