C++ histogram bin sort

Time: 2013-03-04 23:24:59

Tags: c++ histogram bin

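I am writing a function to clone the Histogram feature of the Data Analysis add-in in Excel. Basically, you supply the sample data as input along with the bin ranges. The bin ranges must increase monotonically; in my case they need to be exactly [0 20 40 60 80 100]. Excel counts a sample as falling into a bin if it is greater than the lower bound (left edge) and less than or equal to the upper bound (right edge).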

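I wrote the bin sort algorithm below. It gives incorrect (though very close) output for data0, while the output for data1 and data2 is correct. Correct here means that the algorithm's output exactly matches the table Excel generates, in which the sample count is listed next to each bin. Any help is appreciated!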

#include <iostream>

int main(int argc, char **argv)
{
    const int SAMPLE_COUNT      = 21;
    const int BIN_COUNT         = 6;
    int binranges[BIN_COUNT]    = {0, 20, 40, 60, 80, 100};
    int bins[BIN_COUNT]         = {0, 0, 0, 0, 0, 0};

    int data0[SAMPLE_COUNT] =  {4,82,49,17,89,73,93,86,74,36,74,55,81,61,88,94,72,65,35,25,79};
    // for data0 Excel's bins read:
    // 0    0
    // 20   2
    // 40   3
    // 60   2
    // 80   7
    // 100  7
    //
    // instead the output of bins is: 2 0 3 2 7 7

    int data1[SAMPLE_COUNT] = {88,83,0,0,95,86,0,94,92,77,94,73,93,90,50,95,93,83,0,95,91};
    // for data1 Excel and this algorithm both yield:
    // 0    4
    // 20   0
    // 40   0
    // 60   1
    // 80   2
    // 100  14  (correct)

    int data2[SAMPLE_COUNT] = {58,48,75,68,85,78,74,83,83,75,67,58,75,58,84,68,57,88,55,79,72};
    // for data2 Excel and this algorithm both yield:
    // 0    0
    // 20   0
    // 40   0
    // 60   6
    // 80   10
    // 100  5   (correct)

    for (unsigned int binNum = 1; binNum < BIN_COUNT; ++binNum)
    {
        const int leftEdge = binranges[binNum - 1];
        const int rightEdge = binranges[binNum];

        for (unsigned int sampleNum = 0; sampleNum < SAMPLE_COUNT; ++sampleNum)
        {
            const int sample = data0[sampleNum];

            if (binNum == 1)
            {
                if (sample >= leftEdge && sample <= rightEdge)
                    bins[binNum - 1]++;
            }
            else if (sample > leftEdge && sample <= rightEdge)
            {
                bins[binNum]++;
            }
        }
    }

    for (int i = 0; i < BIN_COUNT; ++i)
        std::cout << bins[i] << " " << std::flush;

    std::cout << std::endl << std::endl;

    return 0;
}

1 answer:

Answer 0 (score: 1)

Assuming the edges are always in increasing order, all you need is:

    // bins needs BIN_COUNT + 1 elements: bins[BIN_COUNT] counts samples above the last edge.
    for (unsigned int sampleNum = 0; sampleNum < SAMPLE_COUNT; ++sampleNum)
    {
        const int sample = data0[sampleNum];
        unsigned int bin = BIN_COUNT;  // default: sample is above every edge

        // The first right edge that is >= sample determines the bin.
        for (unsigned int binNum = 0; binNum < BIN_COUNT; ++binNum)
        {
            if (sample <= binranges[binNum])
            {
                bin = binNum;
                break;
            }
        }
        bins[bin]++;
    }

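For this code to work, though, bins needs one more element than binranges: index 0 counts the values equal to or below the first edge (0), and index BIN_COUNT counts the values above the last edge.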

The rationale is that if you have n separators, you have n + 1 intervals.