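I am trying to speed up code that calculates the volume of a sphere (see the code below). The volume is generated by calculating small volume segments, dv, and summing them into a single volume.
In practice this code is only a sanity check before I apply the calculation to other spheres that have symmetry properties, so I should be able to speed the code up by calculating a smaller volume and multiplying the final result.
Replacing 360 and 180 with 180 and 90 in while (phid<=(360.0/adstep)) and while (thetad<=(180.0/adstep)) respectively should need only a quarter of the calculations, meaning you can simply multiply the final vol by 4.0.
This works if I set phi to 180 and leave theta at 180, which halves the calculations. But when I set theta to 90, it doesn't like it.
Output: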
Phi 360, Theta 180
Actual Volume Calculated Volume % Difference
4.18879020478639053 4.18878971565348923 0.00001167718922403
Phi 180, Theta 180
4.18879020478639053 4.18878971565618219 0.00001167712493440
Phi 180, Theta 90
4.18879020478639053 4.18586538829648180 0.06987363946500515
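You can see above that the first two calculations are nearly identical (I assume the difference is due to precision error), while the last one gives a noticeably different result. Could the nested loops be causing the problem?
Any help would be appreciated, as I haven't found anything in my research (Google and Stack Overflow) that describes the problem I'm having.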
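One way to see where a discrepancy of this size could come from is to write out the sum the code evaluates. The following is a sketch that assumes a constant radius R and the sample points theta_j = j*h used in the code, with h standing for arstep in radians; N_phi and N_theta are the loop counts.

```latex
V = \int_0^{2\pi}\!\!\int_0^{\pi} \frac{R^3}{3}\,\sin\theta \, d\theta\, d\phi = \frac{4}{3}\pi R^3,
\qquad
V \approx \frac{R^3}{3}\, h^2 \sum_{i=1}^{N_\phi} \sum_{j=1}^{N_\theta} \sin(j h).

% Full range, N_\theta h = \pi: \sin vanishes at both ends,
% so the sum behaves like the trapezoidal rule.
h \sum_{j=1}^{N_\theta} \sin(j h) = 2 - \frac{h^2}{6} + O(h^4)

% Half range, N_\theta h = \pi/2: the last sample \sin(\pi/2) = 1
% carries full weight instead of the half weight the trapezoidal rule gives it.
h \sum_{j=1}^{N_\theta} \sin(j h) = 1 + \frac{h}{2} - \frac{h^2}{12} + O(h^4)
```

With h = 0.1 degrees (about 1.745e-3 rad), the h/2 term is roughly 0.09% of the half-range result, which is the same order as the discrepancy in the last row, whereas the full-range error is of order 1e-7. The sign and exact size of the effect depend on where the samples start and on whether the final iteration runs, so this is a plausible contributor rather than a confirmed diagnosis.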
#include <iostream>
#include <iomanip>
#include <cmath>
using namespace std;
int main()
{
    double thetar, phir, maxr, vol, dv, vol2, arstep, adstep, rad, rad3, thetad, phid, ntheta, nphi;
    cout << fixed << setprecision(17); // Set output precision to defined number of decimal places. Note Double has up to 15 decimal place accuracy
    vol=0.0; // Initialise volume and set at zero
    adstep=0.1; // Steps to rotate angles in degrees
    arstep=(adstep/180.0)*M_PI; // Angle steps in radians
    phid=1.0; // Phi in degrees starting at adstep
    maxr = 1.0; // Radius of the sphere
    // Loop to calculate volume
    while (phid<=(360.0/adstep)) // Loop over Phi divided by adstep. This scales the loop to the desired number of calculations.
    {
        phir=((phid*adstep)/180.0)*M_PI; // Phi in radians
        thetad=1.0; // Theta in degrees, reset to initial adstep value
        while (thetad<=(180.0/adstep)) // Loop over Theta divided by adstep. Like Phi loop, this scales the loop to the desired number of calculations
        {
            thetar=((thetad*adstep)/180.0)*M_PI; // Convert theta degrees to radians
            dv = ((maxr*maxr*maxr) * sin(thetar) * arstep * arstep) / 3.0; // Volume of current segment
            vol += dv; // Summing all the dv value together to generate a global volume
            thetad+=1.0; // Increase theta (degrees) by a single step
        }
        phid+=1.0; // Increase phi (degrees) by a single step
    }
    vol = vol*1.0; // Volume compensated for any reduction in phi and theta
    rad3 = (3.0*vol)/(4.0*M_PI); // volume equivalent radius^3
    rad = pow(rad3,(1.0/3.0)); // volume equivalent radius
    vol2 = (4.0/3.0)*M_PI*(maxr*maxr*maxr); // Calculated volume of a sphere given initial maxr
    // Diagnostic output
    cout << vol2 << " " << vol << " " << ((vol2-vol)/vol)*100.0 << endl;
}
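Edit: corrected the starting values of phid and thetad to 1.0.
Edit 2: I just wanted to add, for future viewers, that using the Kahan summation algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm) removed almost all of my precision error, which came from adding a small number to a large one. There are other methods, but this is one of the simplest and it does the job I need. For posterity, this is the example pseudocode taken from the Wikipedia page on the subject: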
function KahanSum(input)
    var sum = 0.0
    var c = 0.0 // A running compensation for lost low-order bits.
    for i = 1 to input.length do
        var y = input[i] - c // So far, so good: c is zero.
        var t = sum + y // Alas, sum is big, y small, so low-order digits of y are lost.
        c = (t - sum) - y // (t - sum) recovers the high-order part of y; subtracting y recovers -(low part of y)
        sum = t // Algebraically, c should always be zero. Beware overly-aggressive optimizing compilers!
        // Next time around, the lost low part will be added to y in a fresh attempt.
    return sum
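Answer 0 (score: 1)
As far as speed goes, I suspect (without having profiled it) that a lot of time is wasted converting between radians and degrees, and also computing all those sin values. AFAICT, thetar loops through the same values during every iteration of the outer loop, so it would probably be more efficient to pre-compute sin(thetar) once before the main loop and do a simple lookup in the inner loop.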
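A minimal sketch of that lookup idea follows. The function name sphereVolumeTable and the parameters nTheta and nPhi (the number of steps over theta and phi) are hypothetical, and the sample points theta = step, 2*step, ... are assumed to match the ones in the question.

```cpp
#include <cmath>
#include <vector>

// Sphere volume with the sin(theta) values computed once, before the main loop.
// nTheta steps cover theta in (0, pi]; nPhi steps cover phi in (0, 2*pi].
double sphereVolumeTable(double radius, int nTheta, int nPhi)
{
    const double pi = std::acos(-1.0);
    const double dTheta = pi / nTheta;
    const double dPhi = 2.0 * pi / nPhi;

    // Lookup table: one sin() call per theta sample instead of one per (phi, theta) pair.
    std::vector<double> sinTheta(nTheta);
    for (int j = 0; j < nTheta; ++j)
        sinTheta[j] = std::sin((j + 1) * dTheta);

    const double factor = radius * radius * radius * dTheta * dPhi / 3.0;
    double vol = 0.0;
    for (int i = 0; i < nPhi; ++i)
        for (int j = 0; j < nTheta; ++j)
            vol += factor * sinTheta[j];   // simple lookup in the inner loop
    return vol;
}
```

For example, sphereVolumeTable(1.0, 1800, 3600) corresponds to the 0.1 degree step in the question. Working in radians throughout also removes the degree-to-radian conversion from the loops.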
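As for numerical stability, as vol grows larger and larger relative to dv, you will start losing more and more precision as time goes on. If you could store all the dv values in an array and then sum them with a divide-and-conquer approach instead of a linear pass, you would in principle get better results. Then again, I count (only) 6 480 000 total iterations, so I think a double accumulator (which holds 15-17 significant base-10 digits) can actually afford to lose 6-7 digits without much trouble.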
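Answer 1 (score: 0)
Most likely the problem is this: you are exiting the loop one iteration before you need to. You should not compare floating-point numbers for equality. A quick way to fix this is to add a small constant, for example
while(thetad<(180.0 / adstep) + 1e-8 )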
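An alternative to adding a small constant is to compute the number of iterations once, round it to an integer, and loop over an integer counter. The sketch below shows that swapped-in approach; stepCount and sinSum are hypothetical helper names, and the angles are assumed to be in degrees as in the question.

```cpp
#include <cmath>

// Number of steps of size `step` needed to cover `range`, rounded to the nearest
// integer so the result does not depend on whether range/step lands slightly
// above or slightly below a whole number.
long stepCount(double range, double step)
{
    return std::lround(range / step);
}

// Sum of sin(theta) * dTheta for theta = step, 2*step, ..., range (degrees in, radians inside).
double sinSum(double rangeDeg, double stepDeg)
{
    const double degToRad = std::acos(-1.0) / 180.0;
    const long n = stepCount(rangeDeg, stepDeg);
    double sum = 0.0;
    for (long k = 1; k <= n; ++k)        // integer counter: exactly n iterations
        sum += std::sin(k * stepDeg * degToRad);
    return sum * stepDeg * degToRad;
}
```

With an integer counter the loop bound is an exact comparison, so the number of iterations is fixed before the loop starts.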
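Answer 2 (score: 0)
This is not a very thorough analysis, but it may give you some insight into the source of the error. In your code you are accumulating 3240000 floating-point values. As the value of vol increases, the ratio of vol to dv grows, and you lose more and more precision in each addition.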
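The standard way to reduce the precision loss when accumulating many values into a single value (known as a reduction sum) is to perform the additions in blocks: for example, you could add up every 8 values and store the results in an array, then add every 8 values of that array together, and so on, until a single value is left. This will probably get you a better result.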
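The block-wise reduction described above might look like the following sketch. The function name blockSum is hypothetical, and it assumes the values to be added have first been collected into a std::vector<double>.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Repeatedly replace the list by the sums of consecutive blocks of `blockSize`
// values until a single value remains. blockSize must be at least 2.
double blockSum(std::vector<double> values, std::size_t blockSize = 8)
{
    if (values.empty())
        return 0.0;
    while (values.size() > 1)
    {
        std::vector<double> partial;
        partial.reserve(values.size() / blockSize + 1);
        for (std::size_t i = 0; i < values.size(); i += blockSize)
        {
            const std::size_t end = std::min(i + blockSize, values.size());
            double s = 0.0;
            for (std::size_t j = i; j < end; ++j)
                s += values[j];          // add up one block
            partial.push_back(s);
        }
        values.swap(partial);            // next pass works on the block sums
    }
    return values[0];
}
```

The vector is taken by value on purpose, so the caller's data is left untouched while the function shrinks its own copy.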
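Also, it is worth considering that you are taking linear steps across the surface of the sphere, so you are not sampling the sphere evenly. This may affect your final result. One way to sample a sphere evenly is to take linear steps in the azimuthal angle phi from 0 to 360 degrees, and to take the acos of a range from -1 to 1 for the polar angle. For a more detailed explanation, see this link on sphere point-picking.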
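As a rough illustration of that sampling scheme, the sketch below steps linearly in phi and in u = cos(theta), then recovers theta with acos. The function name volumeEqualArea, the radius callback, and the use of cell midpoints are all assumptions made for this example; because du already accounts for sin(theta) d(theta), no separate sin factor appears.

```cpp
#include <cmath>
#include <functional>

// Volume of a body described by r(theta, phi), using cells that are uniform in
// phi and in u = cos(theta), so every cell covers the same solid angle.
double volumeEqualArea(const std::function<double(double, double)>& radius,
                       int nU, int nPhi)
{
    const double pi = std::acos(-1.0);
    const double du = 2.0 / nU;              // u runs from -1 to 1
    const double dPhi = 2.0 * pi / nPhi;     // phi runs from 0 to 2*pi
    double vol = 0.0;
    for (int i = 0; i < nPhi; ++i)
    {
        const double phi = (i + 0.5) * dPhi;             // midpoint of the phi cell
        for (int j = 0; j < nU; ++j)
        {
            const double u = -1.0 + (j + 0.5) * du;      // midpoint of the u cell
            const double theta = std::acos(u);           // polar angle for this cell
            const double r = radius(theta, phi);
            vol += r * r * r / 3.0 * du * dPhi;
        }
    }
    return vol;
}
```

For a unit sphere the callback simply returns 1.0 and every cell contributes the same amount, so the scheme is mainly of interest for shapes whose radius varies with theta and phi.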
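Answer 3 (score: 0)
First, I think there are a couple of errors in your function. I think phid and thetad should both be initialized to either 0 or 1.0.
Second, there is quite a bit to be gained by reducing the number of floating-point multiplications.
In the code below, I moved the contents of your main function into volume1 and created a function volume2 that contains slightly optimized code.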
#include <iostream>
#include <iomanip>
#include <cmath>
#include <ctime>
using namespace std;
void volume1(int numSteps)
{
    double thetar, phir, maxr, vol, dv, vol2, arstep, adstep, rad, rad3, thetad, phid, ntheta, nphi;
    cout << fixed << setprecision(17); // Set output precision to defined number of decimal places. Note Double has up to 15 decimal place accuracy
    vol=0.0; // Initialise volume and set at zero
    adstep=360.0/numSteps; // Steps to rotate angles in degrees
    arstep=(adstep/180.0)*M_PI; // Angle steps in radians
    phid=1.0; // Phi in degrees starting at adstep
    maxr = 1.0; // Radius of the sphere
    // Loop to calculate volume
    while (phid<=(360.0/adstep)) // Loop over Phi divided by adstep. This scales the loop to the desired number of calculations.
    {
        phir=((phid*adstep)/180.0)*M_PI; // Phi in radians
        thetad=1.0; // Theta in degrees, reset to initial adstep value
        while (thetad<=(180.0/adstep)) // Loop over Theta divided by adstep. Like Phi loop, this scales the loop to the desired number of calculations
        {
            thetar=((thetad*adstep)/180.0)*M_PI; // Convert theta degrees to radians
            dv = ((maxr*maxr*maxr) * sin(thetar) * arstep * arstep) / 3.0; // Volume of current segment
            vol += dv; // Summing all the dv value together to generate a global volume
            thetad+=1.0; // Increase theta (degrees) by a single step
        }
        phid+=1.0; // Increase phi (degrees) by a single step
    }
    vol = vol*1.0; // Volume compensated for any reduction in phi and theta
    rad3 = (3.0*vol)/(4.0*M_PI); // volume equivalent radius^3
    rad = pow(rad3,(1.0/3.0)); // volume equivalent radius
    vol2 = (4.0/3.0)*M_PI*(maxr*maxr*maxr); // Calculated volume of a sphere given initial maxr
    // Diagnostic output
    cout << vol2 << " " << vol << " " << ((vol2-vol)/vol)*100.0 << endl << endl;
}
void volume2(int numSteps)
{
    double thetar, maxr, vol, vol2, arstep, adstep, rad, rad3, thetad, phid, ntheta, nphi;
    cout << fixed << setprecision(17); // Set output precision to defined number of decimal places. Note Double has up to 15 decimal place accuracy
    vol=0.0; // Initialise volume and set at zero
    adstep = 360.0/numSteps;
    arstep=(adstep/180.0)*M_PI; // Angle steps in radians
    maxr = 1.0; // Radius of the sphere
    double maxRCube = maxr*maxr*maxr;
    double arStepSquare = arstep*arstep;
    double multiplier = maxRCube*arStepSquare/3.0;
    // Loop to calculate volume
    int step = 1;
    for ( ; step <= numSteps; ++step )
    {
        int numInnerSteps = numSteps/2;
        thetad = adstep; // Theta in degrees, reset to initial adstep value
        for ( int innerStep = 1; innerStep <= numInnerSteps; ++innerStep )
        {
            thetar = innerStep*arstep;
            vol += multiplier * sin(thetar); // Volume of current segment
        }
    }
    vol = vol*1.0; // Volume compensated for any reduction in phi and theta
    rad3 = (3.0*vol)/(4.0*M_PI); // volume equivalent radius^3
    rad = pow(rad3,(1.0/3.0)); // volume equivalent radius
    vol2 = (4.0/3.0)*M_PI*(maxr*maxr*maxr); // Calculated volume of a sphere given initial maxr
    // Diagnostic output
    cout << vol2 << " " << vol << " " << ((vol2-vol)/vol)*100.0 << endl << endl;
}
int main()
{
    int numSteps = 3600;
    clock_t start = clock();
    volume1(numSteps);
    clock_t end1 = clock();
    volume2(numSteps);
    clock_t end2 = clock();
    std::cout << "CPU time used: " << 1000.0 * (end1-start) / CLOCKS_PER_SEC << " ms\n";
    std::cout << "CPU time used: " << 1000.0 * (end2-end1) / CLOCKS_PER_SEC << " ms\n";
}
The output I get, using g++ 4.7.3:
4.18879020478639053 4.18762558892993564 0.02781088785811153

4.18879020478639053 4.18878914146923176 0.00002538483372773

CPU time used: 639.00000000000000000 ms
CPU time used: 359.00000000000000000 ms
That is a reduction of about 44% in CPU time.