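For debugging, I need to inspect the values of a struct. I'm new to CUDA, so I may simply be using the Nsight debugger incorrectly. My setup is Developer Studio 2015 with CUDA 9.
I set a breakpoint inside a function in Nsight. Inside this function, everything looks fine. The function is called TraceTheRay_GPU(). It fills a struct named "AbstractIntersection". As long as I stay inside the function, the values of normal, normal_geom, etc. are fine.
As soon as I step out of the function, Nsight shows random values for the struct. What could be the cause?
Visualizing the struct itself does work: when I display the normal as false color in a GL view, it looks right. Only Nsight shows the wrong values.
Inside the function, the values are (for example):
isect->point { x = -57.28, y = -53.74, z = 0}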
isect->normal { x = 0, y = 0, z = 1}
isect->normal_geom { x = 0, y = 0, z = 1}
isect->dist 126.87
After stepping out of the function, they are:
is.point { x = -149.02 , y = -24.74, z = -56.82}
is.normal { x = 95.28, y = 24.74, z = 2.8e-45}
is.normal_geom { x = 0, y = 0, z = 0}
is.dist 100
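One detail in the values above may be a useful hint: 2.8e-45 is far too small to be a plausible normal component, but it is what a 32-bit word containing the integer 2 looks like when read as a float. The following is a minimal sketch for checking such values by hand. The helper names `floatFromBits` and `bitsOf` are hypothetical and not part of the original code.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float (memcpy avoids aliasing problems).
inline float floatFromBits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Inverse direction: expose the raw bit pattern of a float.
inline std::uint32_t bitsOf(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

`floatFromBits(2u)` is the subnormal value 2 * 2^-149, roughly 2.8e-45. This suggests, but does not prove, that the debugger is displaying a location that currently holds integer data rather than the normal.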
Nsight seems to be looking at a different piece of memory. Here is more code, including the struct and how the function is called.
// pseudo code:
kernel function() {
    AbstractIntersection is;
    res = TraceTheRay_GPU(r, num_tri, &is);
    // looking at "is" inside Nsight shows wrong values
    // looking at "is" inside TraceTheRay_GPU shows good values
    // displaying is.normal as false color shows a good normal.
}
// real code
struct AbstractIntersection {
    float3 point;
    float3 normal;
    float3 normal_geom;
    float dist;
};

__device__ int TraceTheRay_GPU(const Ray &r, const int number_of_triangles, AbstractIntersection *isect) {
    float min_t = UINT_MAX;
    int hit_index = -1;
    for (int i = 0; i < number_of_triangles; i++)
    {
        float4 v0 = tex1Dfetch(triangle_texture, i * 3);
        float4 e1 = tex1Dfetch(triangle_texture, i * 3 + 1);
        float4 e2 = tex1Dfetch(triangle_texture, i * 3 + 2);
        float t = RayTriangleIntersection(r, make_float3(v0.x, v0.y, v0.z), make_float3(e1.x, e1.y, e1.z), make_float3(e2.x, e2.y, e2.z));
        //float t = 0;
        if (t < min_t && t > 0.001)
        {
            min_t = t;
            hit_index = i;
        }
    }
    if (hit_index > -1) {
        float4 e1 = tex1Dfetch(triangle_texture, hit_index * 3 + 1);
        float4 e2 = tex1Dfetch(triangle_texture, hit_index * 3 + 2);
        isect->normal = cross(make_float3(e1.x, e1.y, e1.z), make_float3(e2.x, e2.y, e2.z));
        isect->normal = normalize(isect->normal);
        isect->normal_geom = isect->normal; // to start with a valid structure.
        isect->point = r.ori + r.dir * min_t;
        isect->dist = min_t;
        return 1;
    }
    else
        return 0;
}

__global__ void raytrace(unsigned int *out_data,
                         const int w,
                         const int h,
                         const int number_of_triangles,
                         const float3 a, const float3 b, const float3 c,
                         const float3 campos,
                         const float3 light_pos,
                         const float3 light_color,
                         const float3 scene_aabb_min,
                         const float3 scene_aabb_max,
                         curandState *devState)
{
    unsigned int x = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int y = blockIdx.y * blockDim.y + threadIdx.y;
    float xf = (x - 0.5) / ((float)w);
    float yf = (y - 0.5) / ((float)h);
    int ray_depth = 0;
    bool inbox = true;
    float3 t1 = c + (a * xf);
    float3 t2 = b * yf;
    float3 image_pos = t1 + t2;
    Ray r(image_pos, image_pos - campos);
    float3 result = make_float3(0, 0, 0);
    AbstractIntersection is; // there is no init at the moment
                             // full code catches this with a test for "res".
    int res = 0;
    int num_tri; // = number_of_triangles;
    num_tri = 12;
    res = TraceTheRay_GPU(r, num_tri, &is);
    // content of "is" is not shown correctly in Nsight.
    // display the normal as falsecolor (like normalmap)
    result = make_float3((is.normal.x + 1.0) * 0.5, (is.normal.y + 1.0) * 0.5, (is.normal.z + 1.0) * 0.5);
    int val = rgbToInt(result.x * 255, result.y * 255, result.z * 255);
    out_data[y * w + x] = val;
}
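Because "is" is not initialized in the kernel, its contents are only meaningful when TraceTheRay_GPU() returns 1. The following is a minimal sketch of the false-color conversion with that guard made explicit. The helper name `normalToColor` and the black background for a miss are assumptions, not part of the original code. The `#ifndef __CUDACC__` fallback exists only so the sketch also builds with a plain C++ compiler.

```cpp
#ifndef __CUDACC__
// Host-only fallback; under nvcc the real qualifiers and float3 are used.
#define __host__
#define __device__
struct float3 { float x, y, z; };
inline float3 make_float3(float x, float y, float z) { return float3{x, y, z}; }
#endif

struct AbstractIntersection {
    float3 point;
    float3 normal;
    float3 normal_geom;
    float  dist;
};

// Hypothetical helper: map a unit normal from [-1, 1] to [0, 1] per channel,
// but only if the trace reported a hit; otherwise return black (assumption).
__host__ __device__ inline float3 normalToColor(const AbstractIntersection &is, int hit) {
    if (!hit) {
        return make_float3(0.0f, 0.0f, 0.0f);
    }
    return make_float3((is.normal.x + 1.0f) * 0.5f,
                       (is.normal.y + 1.0f) * 0.5f,
                       (is.normal.z + 1.0f) * 0.5f);
}
```

In the kernel this would be called as `result = normalToColor(is, res);`, so an uninitialized "is" is never read when no triangle was hit.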
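Edit: Following Robert's answer, I added an extra function so that the debugger is forced to actually show the values. To keep the optimizer from removing it, the function does some nonsense work whose result is also added to the final output.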
__device__ int checkMyData(AbstractIntersection *is) {
    int ret_val = 0;
    if (is->normal.x > 0.2)
        ret_val = 17;
    else
        ret_val = 18;
    return ret_val;
}
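In Nsight I can then see the correct values of "is" on the first line of checkMyData(). I can also do a single step (F10) and still see the values. They do disappear at the first if() statement. So the scope of a variable in the debugger is the breakpoint plus the following steps, until the first if is hit. I hope this helps other newcomers to CUDA and Nsight.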
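An alternative to a dummy function like checkMyData() is to copy the struct into memory that outlives the local variable, so the values can be inspected regardless of where the debugger considers "is" to be live. The following is a minimal sketch. The helper name `dumpIntersection` and the ten-float slot layout are assumptions, not part of the original code. The `#ifndef __CUDACC__` fallback exists only so the sketch also builds with a plain C++ compiler.

```cpp
#ifndef __CUDACC__
// Host-only fallback; under nvcc the real qualifiers and float3 are used.
#define __host__
#define __device__
struct float3 { float x, y, z; };
#endif

struct AbstractIntersection {
    float3 point;
    float3 normal;
    float3 normal_geom;
    float  dist;
};

// Hypothetical helper: write all ten floats of the struct through a volatile
// pointer, so the compiler may not drop the stores as dead code.
__host__ __device__ inline void dumpIntersection(const AbstractIntersection &is,
                                                 volatile float *slot) {
    slot[0] = is.point.x;       slot[1] = is.point.y;       slot[2] = is.point.z;
    slot[3] = is.normal.x;      slot[4] = is.normal.y;      slot[5] = is.normal.z;
    slot[6] = is.normal_geom.x; slot[7] = is.normal_geom.y; slot[8] = is.normal_geom.z;
    slot[9] = is.dist;
}
```

In a kernel, `slot` would point into a device buffer allocated by the host (for example with cudaMalloc) and copied back with cudaMemcpy after the launch. Separately, building the device code with nvcc's -G option, which generates device debug information and turns off device-code optimization, is usually what keeps locals inspectable in the debugger. The post does not say whether -G was already in use here.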