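Valgrind and CUDA: are the reported leaks real?

Time: 2013-12-15 10:21:18

Tags: memory-leaks cuda valgrind

I have a very simple CUDA component in my application. Valgrind reports a lot of leaks and still-reachable blocks, all related to the cudaMalloc calls.

Are these leaks real? I call cudaFree for every cudaMalloc. Is this valgrind being unable to interpret GPU memory allocation? If the leaks are not real, can I suppress them and have valgrind analyse only the non-GPU part of the application?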

extern "C"
unsigned int *gethash(int nodec, char *h_nodev, int len) {
    unsigned int *h_out = (unsigned int *)malloc(sizeof(unsigned int) * nodec);

    char *d_in;
    unsigned int *d_out;

    cudaMalloc((void**) &d_in, sizeof(char) * len * nodec);
    cudaMalloc((void**) &d_out, sizeof(unsigned int) * nodec);

    cudaMemcpy(d_in, h_nodev, sizeof(char) * len * nodec, cudaMemcpyHostToDevice);

    int blocks = 1 + nodec / 512;


    cube<<<blocks, 512>>>(d_out, d_in, nodec, len);

    cudaMemcpy(h_out, d_out, sizeof(unsigned int) * nodec, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;

}

The last part of the Valgrind output:

...
==5727== 5,468 (5,020 direct, 448 indirect) bytes in 1 blocks are definitely lost in loss record 506 of 523
==5727==    at 0x402B965: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==5727==    by 0x4843910: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x48403E9: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x498B32D: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x494A6E4: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x4849534: ??? (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x48191DD: cuInit (in /usr/lib/nvidia-319-updates/libcuda.so.319.60)
==5727==    by 0x406B4D6: ??? (in /usr/lib/i386-linux-gnu/libcudart.so.5.0.35)
==5727==    by 0x406B61F: ??? (in /usr/lib/i386-linux-gnu/libcudart.so.5.0.35)
==5727==    by 0x408695D: cudaMalloc (in /usr/lib/i386-linux-gnu/libcudart.so.5.0.35)
==5727==    by 0x804A006: gethash (hashkernel.cu:36)
==5727==    by 0x804905F: chkisomorphs (bdd.c:326)
==5727== 
==5727== LEAK SUMMARY:
==5727==    definitely lost: 10,240 bytes in 6 blocks
==5727==    indirectly lost: 1,505 bytes in 54 blocks
==5727==      possibly lost: 7,972 bytes in 104 blocks
==5727==    still reachable: 626,997 bytes in 1,201 blocks
==5727==         suppressed: 0 bytes in 0 blocks

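5 answers:

Answer 0 (score: 6)

This is a well-known issue: valgrind reports a bunch of false positives for CUDA. The best way to avoid seeing them is to use valgrind suppressions, which you can read all about here: http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress

If you want a quicker start that is closer to your specific problem, there is an interesting post on the Nvidia developer forums. It links to a sample suppression rules file. https://devtalk.nvidia.com/default/topic/404607/valgrind-3-4-suppressions-a-little-howto/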

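Answer 1 (score: 2)

I wouldn't trust valgrind or any other leak detector (such as VLD) with CUDA. I'm sure they were not designed with GPU allocations in mind. I don't know whether Nvidia's Nsight has that capability now (I haven't done any GPU programming for nearly six months), but it is the best thing I have used for CUDA debugging and, to be honest, it was hell to work with.

The code you posted shouldn't cause a leak.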
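
If you want to double-check that without valgrind, one option is to compare the free device memory reported by cudaMemGetInfo before and after an allocate/free cycle. The following is only a minimal sketch under assumptions: the buffer sizes are arbitrary, bytes_not_returned is a hypothetical helper, and the figures can be skewed by other processes using the same GPU or by allocator granularity, so a small nonzero difference is not proof of a leak.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: how many bytes of device memory did not come back.
static size_t bytes_not_returned(size_t free_before, size_t free_after) {
    return free_before > free_after ? free_before - free_after : 0;
}

int main() {
    // Force context creation first so its own footprint is not counted.
    if (cudaFree(0) != cudaSuccess) {
        std::fprintf(stderr, "no usable CUDA device\n");
        return 1;
    }

    size_t free_before = 0, free_after = 0, total = 0;
    cudaMemGetInfo(&free_before, &total);

    // Same allocate/free pattern as gethash(), with arbitrary 1 MiB buffers.
    char *d_in = nullptr;
    unsigned int *d_out = nullptr;
    cudaError_t err = cudaMalloc((void **)&d_in, 1 << 20);
    if (err == cudaSuccess) {
        err = cudaMalloc((void **)&d_out, 1 << 20);
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
    }

    // cudaFree on a null pointer is a no-op, so this is safe on failure too.
    cudaFree(d_in);
    cudaFree(d_out);

    cudaMemGetInfo(&free_after, &total);
    std::printf("device bytes not returned: %zu\n",
                bytes_not_returned(free_before, free_after));
    return err == cudaSuccess ? 0 : 1;
}
```

Build it with nvcc and run it directly, outside valgrind.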

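Answer 2 (score: 2)

Try cuda-memcheck --leak-check full. cuda-memcheck is a suite of tools that gives CUDA applications functionality similar to what Valgrind provides. It is installed as part of the CUDA toolkit. More documentation on how to use cuda-memcheck is available here: http://docs.nvidia.com/cuda/cuda-memcheck/

Note that cuda-memcheck is not a drop-in replacement for valgrind and cannot be used to detect host-side memory leaks or buffer overflows.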
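
One caveat, as far as I understand the cuda-memcheck documentation: for the leak summary to be accurate, the CUDA context has to be destroyed before the program exits, which with the runtime API means calling cudaDeviceReset(). A minimal sketch, with a deliberately leaked buffer and a hypothetical ok() helper for error checking:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: report a failed runtime call and return false.
static bool ok(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    float *gpu_data = nullptr;
    if (!ok(cudaMalloc((void **)&gpu_data, 10 * sizeof(float)), "cudaMalloc")) {
        return EXIT_FAILURE;
    }

    // Intentionally no cudaFree(gpu_data): this is the leak the tool should flag.

    // Tear the context down explicitly so the leak checker can write its summary.
    if (!ok(cudaDeviceReset(), "cudaDeviceReset")) {
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```

Run the resulting binary under cuda-memcheck --leak-check full and compare the report with and without the cudaDeviceReset() call.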

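Answer 3 (score: 2)

To add to scarl3tt's answer: this may be too general for some applications, but if you want to use valgrind while ignoring most CUDA issues, pass the option --suppressions=valgrind-cuda.supp, where valgrind-cuda.supp is a file containing the following rules: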

{
   alloc_libcuda
   Memcheck:Leak
   match-leak-kinds: reachable,possible
   fun:*alloc
   ...
   obj:*libcuda.so*
   ...
}

{
   alloc_libcufft
   Memcheck:Leak
   match-leak-kinds: reachable,possible
   fun:*alloc
   ...
   obj:*libcufft.so*
   ...
}

{
   alloc_libcudart
   Memcheck:Leak
   match-leak-kinds: reachable,possible
   fun:*alloc
   ...
   obj:*libcudart.so*
   ...
}

Answer 4 (score: 0)

Since I don't have 50 reputation, I can't comment on @Vyas's answer.

What strikes me as odd is that cuda-memcheck fails to detect a CUDA memory leak.

I just wrote a very simple program that has a CUDA memory leak, but cuda-memcheck --leak-check full does not report it. Here it is:

#include <cstdlib>   // malloc, free
#include <iostream>
#include <cuda_runtime.h>
using namespace std;

int main(){
    float* cpu_data;
    float* gpu_data;
    int buf_size = 10 * sizeof(float);

    cpu_data = (float*)malloc(buf_size);
    for(int i=0; i<10; i++){
        cpu_data[i] = 1.0f * i;
    }

    cudaError_t cudaStatus = cudaMalloc(&gpu_data, buf_size);
    cudaMemcpy(gpu_data, cpu_data, buf_size, cudaMemcpyHostToDevice);

    free(cpu_data);
    //cudaFree(gpu_data);
    return 0;
}

Note that the commented-out line is what makes this program leak CUDA memory. However, running cuda-memcheck ./a.out gives: