Calculating the mean of an array in C++

Date: 2018-11-04 13:52:13

Tags: c++ precision

I ran into a problem when computing the mean of an array in two different ways. Here is the code:

float sum1, sum2, tmp, mean1, mean2;
double sum1_double, sum2_double, tmp_double, mean1_double, mean2_double;
int i, j;
int Nt=29040000;  //array size
int piecesize=32;
int Npiece=Nt/piecesize;
float* img;
float* d_img;
double* img_double;
img_double = (double*)calloc(Nt, sizeof(double));
cudaHostAlloc((void**)&img, sizeof(float)*Nt, cudaHostAllocDefault);
cudaMalloc((void**)&d_img, sizeof(float)*Nt);
...
//Some calculation is done in GPU and the results are stored in d_img;
...    
cudaMemcpy(img, d_img, Nt*sizeof(float), cudaMemcpyDeviceToHost);
for (i=0;i<Nt;i++) img_double[i]=(double)img[i];

//Method 1
sum1=0;
for (i=0;i<Nt;i++) 
{ sum1 += img[i]; }

sum1_double=0;
for (i=0;i<Nt;i++) 
{ sum1_double += img_double[i]; }

//Method 2
sum2=0;
for (i=0;i<Npiece;i++)
{   tmp=0; 
      for (j=0;j<piecesize;j++)
        { tmp += img[i*piecesize+j];}
    sum2 += tmp;
}

sum2_double=0;
for (i=0;i<Npiece;i++)
{   tmp_double=0; 
      for (j=0;j<piecesize;j++)
        { tmp_double += img_double[i*piecesize+j];}
    sum2_double += tmp_double;
}

mean1=sum1/(float)Nt;
mean2=sum2/(float)Nt;
mean1_double=sum1_double/(double)Nt;
mean2_double=sum2_double/(double)Nt;

cout<<setprecision(15)<<mean1<<endl;
cout<<setprecision(15)<<mean2<<endl;
cout<<setprecision(15)<<mean1_double<<endl;
cout<<setprecision(15)<<mean2_double<<endl;

Output:

132.221862792969
129.565872192383
129.565938340543
129.565938340543

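The results from the two methods in single precision (mean1 ≈ 132.2, mean2 ≈ 129.6) differ significantly, while both double-precision results agree at about 129.566. May I know why?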

Thanks a lot!

1 answer:

Answer 0 (score: 4)

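The reason is that floating-point arithmetic is inexact. When you accumulate integers, a float becomes imprecise once abs(value) is greater than 2^24 (I am assuming IEEE-754 32-bit here). For example, a float cannot store 16777217 exactly (depending on the rounding mode, it becomes 16777216 or 16777218).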
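The 2^24 limit can be checked directly. The following is a minimal sketch, assuming float is IEEE-754 binary32 and the default round-to-nearest-even mode is active; the helper names fits_in_float and add_ones are made up for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: true if the integer n survives a round trip
// through float unchanged (assumes IEEE-754 binary32 float).
bool fits_in_float(std::int64_t n) {
    float f = static_cast<float>(n);
    return static_cast<std::int64_t>(f) == n;
}

// Hypothetical helper: adds 1.0f to a float accumulator `count` times.
// Once the accumulator reaches 2^24, each +1 is rounded away.
float add_ones(float start, int count) {
    float acc = start;
    for (int i = 0; i < count; ++i) {
        acc = acc + 1.0f;
    }
    return acc;
}
```

Under those assumptions, fits_in_float(16777216) holds but fits_in_float(16777217) does not, and add_ones(16777216.0f, 10) stays at 16777216. This is the same stall that a single float accumulator runs into once the running sum is much larger than each element.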

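Your second calculation is presumably the more precise one, because less precision is lost thanks to the separate tmp accumulation: each tmp stays small, so adding individual elements to it loses little, and only the per-piece totals are added to the large running sum. This matches your output, where mean2 is close to the double-precision results.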
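The difference between the two methods can be reproduced on a small scale. This is only a sketch, assuming IEEE-754 binary32 floats with round-to-nearest-even; sum_naive and sum_blocked are illustrative names that mirror Method 1 and Method 2 from the question, not code from it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Method 1: one float accumulator over the whole array.
float sum_naive(const std::vector<float>& v) {
    float total = 0.0f;
    for (float x : v) total += x;
    return total;
}

// Method 2: sum each piece into a small temporary, then add the
// temporary to the running total. A shorter last piece is handled too.
float sum_blocked(const std::vector<float>& v, std::size_t block) {
    assert(block > 0);
    float total = 0.0f;
    for (std::size_t start = 0; start < v.size(); start += block) {
        std::size_t end = std::min(start + block, v.size());
        float tmp = 0.0f;  // stays small, so little is lost per addition
        for (std::size_t i = start; i < end; ++i) tmp += v[i];
        total += tmp;
    }
    return total;
}
```

As a rough check, for 20,000,000 elements all equal to 1.0f, sum_naive should stall at 16777216, while sum_blocked with block = 32 should return 20000000, since every per-piece total of 32 is still representable at that magnitude. Blocking reduces the error but does not remove it in general.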

Change the sum1, sum2, and tmp variables to long long int, and you should hopefully get the same result from both calculations.
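A sketch of that suggestion, under the same assumption the answer makes, namely that every element holds an integer value. The function names are hypothetical; the cast truncates toward zero, so it is exact only for integer-valued input.

```cpp
#include <vector>

// Accumulates in a 64-bit integer, so additions are exact as long as
// the total fits in long long. Assumes each element is integer-valued.
long long sum_as_integers(const std::vector<float>& img) {
    long long total = 0;
    for (float x : img) total += static_cast<long long>(x);
    return total;
}

// Mean computed from the exact integer sum; returns 0.0 for empty input.
double mean_from_integer_sum(const std::vector<float>& img) {
    if (img.empty()) return 0.0;
    return static_cast<double>(sum_as_integers(img)) /
           static_cast<double>(img.size());
}
```

Because integer addition does not round, the order of summation no longer matters, which is why both methods would then agree.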

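Note: I have assumed that your img stores integer data. If it stores floats, there is no easy way to fix this perfectly. One way is to use double instead of float for sum1, sum2, and tmp. The difference will still be there, but it will be much smaller. There are also techniques that accumulate floats more precisely than simple summation, such as Kahan summation.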
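Both options from the note can be sketched as follows. This is an illustration rather than a drop-in fix: the function names are made up, the Kahan loop is the textbook compensated-summation form, and it assumes the compiler does not reassociate floating-point operations (flags such as -ffast-math may optimize the compensation away).

```cpp
#include <vector>

// Option 1: keep the data as float but accumulate in double.
double sum_in_double(const std::vector<float>& v) {
    double total = 0.0;
    for (float x : v) total += static_cast<double>(x);
    return total;
}

// Option 2: Kahan (compensated) summation entirely in float.
float sum_kahan(const std::vector<float>& v) {
    float sum = 0.0f;
    float c = 0.0f;            // running estimate of the lost low-order bits
    for (float x : v) {
        float y = x - c;       // apply the correction to the next element
        float t = sum + y;     // low-order bits of y may be lost here
        c = (t - sum) - y;     // recover what was lost (with opposite sign)
        sum = t;
    }
    return sum;
}
```

For the input {16777216, 1, 1, 1, 1}, a plain float loop stays at 16777216, while both functions above return 16777220 under IEEE-754 round-to-nearest-even. The double accumulator is usually the simpler choice on the host; Kahan summation helps when a wider type is unavailable or too slow.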