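I am developing a GPU / OpenCL NBody code. I use OpenGL from the AMD APP SDK to render the particle positions. When I run the code, I get random segmentation faults.
In summary, I have a GLWidget in which I do the OpenGL rendering. After generating the initial positions, I render them in this GLWidget. Then I run the simulation, and at each step I compute the next positions and display them in the GLWidget. My problem is that sometimes, if I click the "Generate initial conditions" button of the parameters GUI while the simulation is running, I get a segmentation fault.
Here is the backtrace: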
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff4a46cd7 in memcpy () from /lib/libc.so.6
(gdb) bt
#0 0x00007ffff4a46cd7 in memcpy () from /lib/libc.so.6
#1 0x00007fffeda2da64 in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#2 0x00007fffedbba74a in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#3 0x00007fffedbba9af in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#4 0x00007fffed9c56e4 in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#5 0x00007fffed17371d in ?? () from /usr/lib/x86_64-linux-gnu/dri/fglrx_dri.so
#6 0x000000000040b185 in GLWidget::createVBO() ()
#7 0x000000000040b3c9 in GLWidget::draw() ()
#8 0x000000000040c36d in GLWidget::processCurrent() ()
...
Here is the createVBO routine:
void GLWidget::createVBO()
{
    GLuint vbo;
    int memSize = sizeof(cl_double4) * 4 * Galaxy->getNumParticles();
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, memSize, Galaxy->pos, GL_DYNAMIC_DRAW);
}
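The segfault occurs at glBufferData(GL_ARRAY_BUFFER, memSize, Galaxy->pos, GL_DYNAMIC_DRAW);
I don't understand why this happens. When I press the "Generate IC" button, I delete the already allocated Galaxy->pos array and create a new one.
Here is what I do in the "Generate IC" routine: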
//Clean Galaxy already existing
if (parent->widget_2->isGalaxyExist)
{
    if (parent->widget_2->animation)
        parent->resetSimu();
    parent->widget_2->Galaxy->cleanup();
}
with the cleanup routine (where I delete the pos array):
int NBody::cleanup()
{
    if (glEvent)
        clReleaseEvent(glEvent);
    // Releases OpenCL resources (Context, Memory etc.)
    cl_int status;
    if (hasRunKernel)
    {
        status = clFinish(commandQueue);
        CHECK_OPENCL_ERROR(status, "clFinish failed.(commandQueue)");
        status = clReleaseKernel(kernel);
        CHECK_OPENCL_ERROR(status, "clReleaseKernel failed.(kernel)");
        status = clReleaseProgram(program);
        CHECK_OPENCL_ERROR(status, "clReleaseProgram failed.(program)");
        status = clReleaseMemObject(currPos);
        CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(currPos)");
        status = clReleaseMemObject(currVel);
        CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(currVel)");
        status = clReleaseMemObject(newPos);
        CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(newPos)");
        status = clReleaseMemObject(newVel);
        CHECK_OPENCL_ERROR(status, "clReleaseMemObject failed.(newVel)");
        status = clReleaseCommandQueue(commandQueue);
        CHECK_OPENCL_ERROR(status, "clReleaseCommandQueue failed.(commandQueue)");
        status = clReleaseContext(context);
        CHECK_OPENCL_ERROR(status, "clReleaseContext failed.(context)");
        hasRunKernel = false;
    }
    // Release program resources
    delete [] pos;
    delete [] vel;
    delete [] initPos;
    delete [] initVel;
    delete [] devices;
    // Delete current instance
    delete this;
    return NBODY_SUCCESS;
}
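At first glance, can you see what is wrong, or give me a clue about this segfault? The most annoying part is that the error occurs randomly, not on every run.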
Answer 0 (score: 1)
Is this calculation correct?
int memSize = sizeof(cl_double4) * 4 * Galaxy->getNumParticles();
In particular the "* 4": sizeof(cl_double4) already accounts for the four elements of the vector.
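To make the size arithmetic concrete, here is a minimal sketch. It assumes a stand-in type Double4 with the same 32-byte size as cl_double4, and it assumes pos holds one such element per particle, which the question does not show. The function names are hypothetical.

```cpp
#include <cstddef>

// Stand-in for cl_double4 (assumption: four doubles, 32 bytes in total).
struct Double4 { double s[4]; };

// Bytes needed when the array holds one Double4 per particle.
constexpr std::size_t positionBytes(std::size_t numParticles) {
    return sizeof(Double4) * numParticles;
}

// Byte count as computed in the question, with the extra "* 4".
constexpr std::size_t questionBytes(std::size_t numParticles) {
    return sizeof(Double4) * 4 * numParticles;
}

// The question's formula asks for four times as many bytes.
static_assert(questionBytes(1000) == 4 * positionBytes(1000),
              "the extra factor multiplies the requested size by four");
```

If pos really is allocated with getNumParticles() elements, the memSize from the question would make glBufferData read four times past what was allocated, which could explain a crash inside memcpy.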
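Answer 1 (score: 1)
A crash like this indicates an out-of-bounds access in the driver code invoked through the glBufferData OpenGL API function. Check that the arguments passed to glBufferData are correct, i.e. that the length glBufferData is told to read lies within the memory range passed as the data argument.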
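One way to perform that check is sketched below. It assumes the positions are kept in a std::vector so that the allocated size is known at the call site; the question uses a raw array, so this is a substitution, and readStaysInBounds is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if reading `requestedBytes` starting at
// data.data() stays inside the storage owned by the vector.
template <typename T>
bool readStaysInBounds(const std::vector<T>& data, std::size_t requestedBytes) {
    return requestedBytes <= data.size() * sizeof(T);
}
```

Calling it with the same byte count that is about to be handed to glBufferData, for example inside an assert, would flag an oversized length before the driver copies the data. It cannot detect a pointer whose memory has already been freed.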