cudaMemcpy2D causing a seg fault

Time: 2016-06-08 17:14:58

Tags: cuda

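I am trying to allocate and initialize a 2D array using cudaMallocPitch() and cudaMemcpy2D(). I have been able to allocate several arrays with these APIs, but one particular array makes my program crash.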

My code is,

int size = totalPat * trainingSize * wordSize; // 65 * 672 * 15
char ** h_pattern = (char**) malloc((size_t) 40 * sizeof(char));

for(int i = 0; i < 40; i++){
   h_pattern[i] = (char*) malloc((size_t) size * sizeof(char));
   fill_n(h_pattern[i], size, '\0');
 }

 char * d_pattern;
 size_t dpitch;
 size_t spitch = size * sizeof(char);

 cudaMallocPitch(&d_pattern, &dpitch, spitch, 40);
 cudaMemcpy2D(d_pattern, dpitch, h_pattern, spitch, spitch, 40, cudaMemcpyHostToDevice); 

I used cuda-gdb to debug my program and locate the problem, and it keeps seg faulting in cudaMemcpy2D(). The backtrace gives the following output,

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff501dd00 in cudbgGetAPIVersion () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
(cuda-gdb) backtrace
#0  0x00007ffff501dd00 in cudbgGetAPIVersion () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1  0x00007ffff4efc68e in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007ffff4f0cc7f in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00007ffff4efd7f1 in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4  0x00007ffff4e6b322 in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5  0x00007ffff4e74b38 in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6  0x00007ffff4e4d92a in cuMemcpy2DUnaligned_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#7  0x000000000045bc5d in cudart::driverHelper::memcpy2DPtr(char*, unsigned long, char const*, unsigned long, unsigned long, unsigned long, cudaMemcpyKind, CUstream_st*, bool, bool) ()
#8  0x0000000000435039 in cudart::cudaApiMemcpy2DCommon(void*, unsigned long, void const*, unsigned long, unsigned long, unsigned long, cudaMemcpyKind, bool) ()
#9  0x00000000004350f8 in cudart::cudaApiMemcpy2D(void*, unsigned long, void const*, unsigned long, unsigned long, unsigned long, cudaMemcpyKind) ()
#10 0x0000000000462073 in cudaMemcpy2D ()

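There is a question on the devtalk forum about a pitch limit, where cudaMemcpy2D() failed for pitches greater than 2^18, but that question is from 2007 and I assume the limit no longer exists. The documentation also mentions that cudaMemcpy2D() returns an error if dpitch or spitch exceeds the maximum allowed, but it does not say what the maximum allowed value is.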

Any help is greatly appreciated.

1 Answer:

Answer 0: (score: 1)

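Your code is trying to copy 40 * size bytes of char data out of a 40-byte host memory space of type char*. cudaMemcpy2D() expects the source to be one contiguous block, but h_pattern is a table of pointers to 40 separately allocated rows.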
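To see why this faults, it may help to look at how a pitched 2D copy addresses its source. The following host-only C++ sketch is an illustration, not the CUDA runtime's actual implementation: `copy2d` and `make_contiguous_rows` are hypothetical helpers that mimic the documented addressing rule, where row r is read from src + r * spitch. When the source is a char**, src points at the table of row pointers rather than at the character data, so the copy walks far past that small allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-only stand-in for a pitched 2D copy (not a CUDA API).
// Row r is read from src + r * spitch and written to dst + r * dpitch;
// only the first `width` bytes of each row are copied.
void copy2d(char* dst, std::size_t dpitch,
            const char* src, std::size_t spitch,
            std::size_t width, std::size_t height) {
    assert(width <= dpitch && width <= spitch);
    for (std::size_t r = 0; r < height; ++r) {
        std::memcpy(dst + r * dpitch, src + r * spitch, width);
    }
}

// Builds `rows` rows of `width` bytes in ONE contiguous buffer, which is the
// layout the addressing rule above requires. Row r is filled with 'a' + r.
std::vector<char> make_contiguous_rows(std::size_t rows, std::size_t width) {
    std::vector<char> buf(rows * width);
    for (std::size_t r = 0; r < rows; ++r) {
        std::memset(buf.data() + r * width, 'a' + static_cast<int>(r), width);
    }
    return buf;
}
```

For example, with three 4-byte rows copied into a destination whose pitch is 8 bytes, `copy2d(dst, 8, src, 4, 4, 3)` places row 1 at dst + 8 and leaves the 4 padding bytes at the end of each destination row untouched.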

Instead, you need to malloc one linear memory space on the host for all 40 patterns, like:

char*
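A minimal sketch of such a linear host allocation, assuming the dimensions from the question (40 rows of 65 * 672 * 15 bytes each). The `CHECK` macro and the variable layout are illustrative and not taken from the original post. The sketch also prints the `memPitch` field of `cudaDeviceProp`, which reports the maximum pitch in bytes allowed by memory copies on the device, since the question asks where that limit can be found.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Illustrative helper (not from the original post): abort on any CUDA error.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s failed: %s\n", #call,              \
                         cudaGetErrorString(err_));                     \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

int main() {
    const size_t rows  = 40;
    const size_t width = 65 * 672 * 15;  // bytes per row, as in the question

    // One contiguous host buffer: row r starts at h_pattern + r * spitch.
    const size_t spitch = width * sizeof(char);
    char* h_pattern = static_cast<char*>(std::malloc(rows * spitch));
    if (h_pattern == nullptr) return EXIT_FAILURE;
    std::memset(h_pattern, 0, rows * spitch);

    // Query the device's maximum pitch for memory copies.
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    std::printf("max pitch: %zu bytes\n", prop.memPitch);

    // Pitched device allocation; dpitch is chosen by the runtime.
    char*  d_pattern = nullptr;
    size_t dpitch    = 0;
    CHECK(cudaMallocPitch(reinterpret_cast<void**>(&d_pattern), &dpitch,
                          width, rows));

    // Source and destination are both flat char buffers with known pitches.
    CHECK(cudaMemcpy2D(d_pattern, dpitch, h_pattern, spitch,
                       width, rows, cudaMemcpyHostToDevice));

    CHECK(cudaFree(d_pattern));
    std::free(h_pattern);
    return 0;
}
```

With this layout, the element at row r and column c lives at h_pattern[r * width + c] on the host and at d_pattern + r * dpitch + c on the device. Checking the return value of every runtime call also turns an invalid pitch into a readable error message.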