Question

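Let's say I want to ask the user for some integers and then do some arithmetic on them (compute the mean, the mode, and so on). What is the best and most efficient way to collect the data so that statistical functions can be applied to it?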

Answer 1

The median and the mode are easier to find from sorted data.
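To illustrate why sorting helps, here is a minimal sketch. The function names median_of_sorted and mode_of_sorted are made up for this example, and both assume the input is non-empty and already sorted in increasing order. Once the data is sorted, the median is a middle element and the mode is the value with the longest run of equal neighbours.

```cpp
#include <cstddef>
#include <vector>

// Median of an already-sorted, non-empty vector.
// For an even count, this averages the two middle values.
double median_of_sorted(const std::vector<int>& sorted)
{
    const std::size_t n = sorted.size();
    if (n % 2 == 1)
        return sorted[n / 2];
    return (static_cast<double>(sorted[n / 2 - 1]) + sorted[n / 2]) / 2.0;
}

// Mode of an already-sorted, non-empty vector. Equal values are adjacent,
// so the mode is the value with the longest run. Ties go to the smallest value.
int mode_of_sorted(const std::vector<int>& sorted)
{
    int best = sorted.front();
    std::size_t best_run = 0;
    for (std::size_t i = 0; i < sorted.size();)
    {
        // Advance j to the end of the run of values equal to sorted[i].
        std::size_t j = i;
        while (j < sorted.size() && sorted[j] == sorted[i])
            ++j;
        if (j - i > best_run)
        {
            best_run = j - i;
            best = sorted[i];
        }
        i = j;
    }
    return best;
}
```

How ties are broken for the mode is a choice made here for simplicity; a real program might want to report every tied value.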

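If the user will be typing the data in, then inserting each value into the array in sorted position is a good choice, because the work of sorting is spread across all the entries.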
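One way to keep a vector sorted as values arrive is sketched below. insert_sorted is a hypothetical helper name, not something from the standard library. It uses std::upper_bound to find the insertion point by binary search.

```cpp
#include <algorithm>
#include <vector>

// Insert `value` into `sorted` so that the vector stays in increasing order.
// `sorted` must already be in increasing order (an empty vector qualifies).
void insert_sorted(std::vector<int>& sorted, int value)
{
    // First position whose element is greater than `value`.
    auto pos = std::upper_bound(sorted.begin(), sorted.end(), value);
    sorted.insert(pos, value);
}
```

Each insertion may shift the later elements, so it costs linear time in the worst case. That is unlikely to matter at the speed a person types.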

If the data comes from an electronic source, such as a file, it is probably best to read it all in and then sort.

However you choose to process it, store it in a std::vector or std::deque, since these make efficient use of memory, with good cache behaviour and efficient random access.

Answer 2

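You can "collect data" using a std::istream - specifically std::cin if you want standard input (which defaults to the keyboard, or some redirected/piped file or command output), otherwise a std::ifstream to read a file directly. For example: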

#include <cstdlib>   // std::exit
#include <iostream>  // std::cin, std::cerr

double my_double;
if (!(std::cin >> my_double))
{
    std::cerr << "unable to read and parse a double from standard input\n";
    std::exit(1);
}
// ...use my_double...

For storing the values, it is probably best to start with a std::vector<double>:

std::vector<double> my_doubles;
my_doubles.push_back(my_double);

// add all the doubles...
double total = 0;
for (auto& d : my_doubles)
    total += d;
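The same summation can be written with std::accumulate from <numeric>, which leads directly to the mean. The sketch below uses a hypothetical helper named mean_of. Returning 0.0 for an empty vector is an arbitrary choice made for this example.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of the values; returns 0.0 for an empty vector
// (an arbitrary convention chosen for this sketch).
double mean_of(const std::vector<double>& values)
{
    if (values.empty())
        return 0.0;
    // The 0.0 initial value makes the accumulation happen in double.
    const double total = std::accumulate(values.begin(), values.end(), 0.0);
    return total / static_cast<double>(values.size());
}
```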

An example combining these:

// read/store all the numbers from the current position in the input stream...
while (std::cin >> my_double)
    my_doubles.push_back(my_double);

If it is useful, you can sort the container:

std::sort(std::begin(my_doubles), std::end(my_doubles)); // default increasing

std::sort(std::begin(my_doubles), std::end(my_doubles), // decreasing
          [](double x, double y) { return x > y; });

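Some operations may be easier with other container types. For example, std::set<> is a convenient way to keep values sorted while rejecting duplicates, whereas std::multiset can store duplicates.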
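The difference between the two containers can be sketched as follows. distinct_count and occurrences are hypothetical helper names used only for illustration.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Number of distinct values: std::set drops duplicates on insertion.
std::size_t distinct_count(const std::vector<int>& values)
{
    std::set<int> unique(values.begin(), values.end());
    return unique.size();
}

// Occurrences of `value`: std::multiset keeps duplicates, in sorted order,
// and count() reports how many equal elements it holds.
std::size_t occurrences(const std::vector<int>& values, int value)
{
    std::multiset<int> all(values.begin(), values.end());
    return all.count(value);
}
```

Both containers are node-based, so they usually use more memory and have worse cache behaviour than a sorted std::vector.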

Answer 3

Use the Boost.Accumulators framework.

Here is their introductory example:

#include <iostream>
#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics/stats.hpp>
#include <boost/accumulators/statistics/mean.hpp>
#include <boost/accumulators/statistics/moment.hpp>
using namespace boost::accumulators;

int main()
{
    // Define an accumulator set for calculating the mean and the
    // 2nd moment ...
    accumulator_set<double, stats<tag::mean, tag::moment<2> > > acc;

    // push in some data ...
    acc(1.2);
    acc(2.3);
    acc(3.4);
    acc(4.5);

    // Display the results ...
    std::cout << "Mean:   " << mean(acc) << std::endl;
    std::cout << "Moment: " << accumulators::moment<2>(acc) << std::endl;

    return 0;
}

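The framework defines a plethora of accumulators, often providing both lazy and eager versions of a desired statistical operation. As you would expect, it is also extensible.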

Best way to hold a set of integers
