问题简介
我正在尝试加快我正在编写的(2d)光线跟踪器的交集代码。我正在使用C#和System.Numerics库来提高SIMD指令的速度。
问题在于,我得到了奇怪的结果,屋顶加速和相当低的加速。我的问题是,为什么一个人过高而另一个人过低?
上下文:
尝试
我已经尝试查找某些操作是否受到适当支持(请注意:这些操作适用于c ++。https://fgiesen.wordpress.com/2016/04/03/sse-mind-the-gap/或http://sci.tuomastonteri.fi/programming/sse),但似乎并非如此因为我使用的笔记本电脑支持SSE 4.2。
在当前代码中,将应用以下更改:
代码...代码段?
为大量的代码表示歉意,但是我不确定如果没有那么多的代码,我们如何才能具体讨论这个问题。
射线代码-> BoundingBox
public bool CollidesWith(Ray ray, out float t)
{
// https://gamedev.stackexchange.com/questions/18436/most-efficient-aabb-vs-ray-collision-algorithms
// r.dir is unit direction vector of ray
float dirfracx = 1.0f / ray.direction.X;
float dirfracy = 1.0f / ray.direction.Y;
// lb is the corner of AABB with minimal coordinates - left bottom, rt is maximal corner
// r.org is origin of ray
float t1 = (this.rx.min - ray.origin.X) * dirfracx;
float t2 = (this.rx.max - ray.origin.X) * dirfracx;
float t3 = (this.ry.min - ray.origin.Y) * dirfracy;
float t4 = (this.ry.max - ray.origin.Y) * dirfracy;
float tmin = Math.Max(Math.Min(t1, t2), Math.Min(t3, t4));
float tmax = Math.Min(Math.Max(t1, t2), Math.Max(t3, t4));
// if tmax < 0, ray (line) is intersecting AABB, but the whole AABB is behind us
if (tmax < 0)
{
t = tmax;
return false;
}
// if tmin > tmax, ray doesn't intersect AABB
if (tmin > tmax)
{
t = tmax;
return false;
}
t = tmin;
return true;
}
RayPack的代码-> BoundingBoxPack
public Vector<int> CollidesWith(ref RayPack ray, out Vector<float> t)
{
// ------------------------------------------------------- \\
// compute the collision. \\
Vector<float> dirfracx = Constants.ones / ray.direction.x;
Vector<float> dirfracy = Constants.ones / ray.direction.y;
Vector<float> t1 = (this.rx.min - ray.origin.x) * dirfracx;
Vector<float> t2 = (this.rx.max - ray.origin.x) * dirfracx;
Vector<float> t3 = (this.ry.min - ray.origin.y) * dirfracy;
Vector<float> t4 = (this.ry.max - ray.origin.y) * dirfracy;
Vector<float> tmin = Vector.Max(Vector.Min(t1, t2), Vector.Min(t3, t4));
Vector<float> tmax = Vector.Min(Vector.Max(t1, t2), Vector.Max(t3, t4));
Vector<int> lessThanZeroMask = Vector.GreaterThan(tmax, Constants.zeros);
Vector<int> greaterMask = Vector.LessThan(tmin, tmax);
Vector<int> combinedMask = Vector.BitwiseOr(lessThanZeroMask, greaterMask);
// ------------------------------------------------------- \\
// Keep track of the t's that collided. \\
t = Vector.ConditionalSelect(combinedMask, tmin, Constants.floatMax);
return combinedMask;
}
射线代码->圆
public bool Intersect(Circle other)
{
// Step 0: Work everything out on paper!
// Step 1: Gather all the relevant data.
float ox = this.origin.X;
float dx = this.direction.X;
float oy = this.origin.Y;
float dy = this.direction.Y;
float x0 = other.origin.X;
float y0 = other.origin.Y;
float cr = other.radius;
// Step 2: compute the substitutions.
float p = ox - x0;
float q = oy - y0;
float r = 2 * p * dx;
float s = 2 * q * dy;
// Step 3: compute the substitutions, check if there is a collision.
float a = dx * dx + dy * dy;
float b = r + s;
float c = p * p + q * q - cr * cr;
float DSqrt = b * b - 4 * a * c;
// no collision possible! Commented out to make the benchmark more fair
//if (DSqrt < 0)
//{ return false; }
// Step 4: compute the substitutions.
float D = (float)Math.Sqrt(DSqrt);
float t0 = (-b + D) / (2 * a);
float t1 = (-b - D) / (2 * a);
float ti = Math.Min(t0, t1);
if(ti > 0 && ti < t)
{
t = ti;
return true;
}
return false;
}
RayPack的代码-> CirclePack 未经编辑的原始代码可以在https://pastebin.com/87nYgZrv
中找到public Vector<int> Intersect(CirclePack other)
{
// ------------------------------------------------------- \\
// Put all the data on the stack. \\
Vector<float> zeros = Constants.zeros;
Vector<float> twos = Constants.twos;
Vector<float> fours = Constants.fours;
// ------------------------------------------------------- \\
// Compute whether the ray collides with the circle. This \\
// is stored in the 'mask' vector. \\
Vector<float> p = this.origin.x - other.origin.x; ;
Vector<float> q = this.origin.y - other.origin.y;
Vector<float> r = twos * p * this.direction.x;
Vector<float> s = twos * q * this.direction.y; ;
Vector<float> a = this.direction.x * this.direction.x + this.direction.y * this.direction.y;
Vector<float> b = r + s;
Vector<float> c = p * p + q * q - other.radius * other.radius;
Vector<float> DSqrt = b * b - fours * a * c;
Vector<int> maskCollision = Vector.GreaterThan(DSqrt, zeros);
// Commented out to make the benchmark more fair.
//if (Vector.Dot(maskCollision, maskCollision) == 0)
//{ return maskCollision; }
// ------------------------------------------------------- \\
// Update t if and only if there is a collision. Take \\
// note of the conditional where we compute t. \\
Vector<float> D = Vector.SquareRoot(DSqrt);
Vector<float> bMinus = Vector.Negate(b);
Vector<float> twoA = twos * a;
Vector<float> t0 = (bMinus + D) / twoA;
Vector<float> t1 = (bMinus - D) / twoA;
Vector<float> tm = Vector.ConditionalSelect(Vector.LessThan(t1, t0), t1, t0);
Vector<int> maskBiggerThanZero = Vector.GreaterThan(tm, zeros);
Vector<int> maskSmallerThanT = Vector.LessThan(tm, this.t);
Vector<int> mask = Vector.BitwiseAnd(
maskCollision,
Vector.BitwiseAnd(
maskBiggerThanZero,
maskSmallerThanT)
);
this.t = Vector.ConditionalSelect(
mask, // the bit mask that allows us to choose.
tm, // the smallest of the t's.
t); // if the bit mask is false (0), then we get our original t.
return mask;
}
组装代码
这些可以在pastebin上找到。请注意,基准工具中有一些样板组件。您需要查看函数调用。
基准化
我一直在用BenchmarkDotNet进行基准测试。
Circle / CirclePack的结果(已更新):
| Method | Mean | Error | StdDev |
|------------------- |---------:|----------:|----------:|
| Intersection | 9.710 ms | 0.0540 ms | 0.0505 ms |
| IntersectionPacked | 3.296 ms | 0.0055 ms | 0.0051 ms |
BoundingBox / BoundingBoxPacked的结果:
| Method | Mean | Error | StdDev |
|------------------- |----------:|----------:|----------:|
| Intersection | 24.269 ms | 0.2663 ms | 0.2491 ms |
| IntersectionPacked | 1.152 ms | 0.0229 ms | 0.0264 ms |x
由于采用了AVX,因此预计速度会提高6到8倍左右。边界框的加速是 significant ,而圆的加速却很低。
重温最上方的问题:为什么一个加速是过高的,而另一个加速却很低?以及两者中的较低者(CirclePack)如何变得更快?
关于Peter Cordes的编辑(评论)