When the X server is

Time: 2016-01-07 18:52:27

Tags: cuda gpgpu

I launch my kernel and check for possible errors as follows:

kernel<<<grid,block>>>(d_Basis, d_repul_aux,nao);
cout<<"done with the ERIs...."<<endl;
std::string error = cudaGetErrorString(cudaPeekAtLastError());
cout<<error<<endl;

HANDLE_ERROR(cudaMemcpy(eris_gpu_cpu_aux.data(),d_repulsion_aux,eris_size*sizeof(double),cudaMemcpyDeviceToHost));

Here cudaGetErrorString(cudaPeekAtLastError()) is used to check the kernel for errors, and HANDLE_ERROR is defined as:

static void HandleError( cudaError_t err,
                         const char *file,
                         int line ) {
  if (err != cudaSuccess) {
    printf( "%s in %s at line %d\n", cudaGetErrorString( err ),
            file, line );
    exit( EXIT_FAILURE );
  }
}

#define HANDLE_ERROR( err ) (HandleError( err, __FILE__, __LINE__ ))

When the X server is off, the computation runs as expected; but if I turn the X server on, the kernel hangs and I get the following output:

done with the ERIs....
no error
the launch timed out and was terminated in main.cu at line 1038

Line 1038 in the source code corresponds to:

HANDLE_ERROR(cudaMemcpy(eris_gpu_cpu_aux.data(),d_repulsion_aux,eris_size*sizeof(double),cudaMemcpyDeviceToHost));

This means the computation crashes when the results are copied from the device to the host. I am using a GeForce GTX 480 graphics card and CUDA 7.5.
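A note on why the failure shows up at the cudaMemcpy line rather than at the kernel: kernel launches are asynchronous, so cudaPeekAtLastError() called right after the launch reports only launch-time problems (such as an invalid grid or block size). An execution failure such as a watchdog timeout is reported by the next call that synchronizes with the device, which here is cudaMemcpy. The following is a minimal sketch, not the original program: busy_kernel is a hypothetical placeholder for the real kernel, and the sizes are arbitrary. It adds cudaDeviceSynchronize() after the launch so that an execution error is attributed to the kernel itself.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Same idea as the HANDLE_ERROR macro in the question.
static void HandleError(cudaError_t err, const char *file, int line) {
  if (err != cudaSuccess) {
    printf("%s in %s at line %d\n", cudaGetErrorString(err), file, line);
    exit(EXIT_FAILURE);
  }
}
#define HANDLE_ERROR(err) (HandleError((err), __FILE__, __LINE__))

// Hypothetical placeholder kernel; it only stands in for the real one.
__global__ void busy_kernel(double *out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = 2.0 * i;
}

int main() {
  const int n = 1 << 20;  // arbitrary size chosen for illustration
  double *d_out = nullptr;
  HANDLE_ERROR(cudaMalloc(&d_out, n * sizeof(double)));

  busy_kernel<<<(n + 255) / 256, 256>>>(d_out, n);
  // Reports launch-time errors only (bad grid/block size, etc.).
  HANDLE_ERROR(cudaPeekAtLastError());
  // Blocks until the kernel has finished; execution errors such as a
  // watchdog timeout are reported here instead of at a later cudaMemcpy.
  HANDLE_ERROR(cudaDeviceSynchronize());

  HANDLE_ERROR(cudaFree(d_out));
  return 0;
}
```

With this pattern the "launch timed out" message would point at the cudaDeviceSynchronize() line directly after the kernel launch. The cause is unchanged; only the reported location moves.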

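To work around the problem, I tried turning off the "Interactive" option in the /etc/X11/xorg.conf file, but the X server does not recognize this option. What should I do to share the GPU between the X server and a GPGPU application? I am stuck on this, because I cannot write and/or debug my code in a text-mode environment.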

1 answer:

Answer 0 (score: 1)

My previous /etc/X11/xorg.conf file was as follows:

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 319.21  (buildmeister@swio-display-x86-rhel47-14)  Sun May 12 00:46:48 PDT 2013


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

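To solve the problem, the watchdog timeout has to be disabled by adding the "Interactive" option to the "Device" section, as follows: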

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 319.21  (buildmeister@swio-display-x86-rhel47-14)  Sun May 12 00:46:48 PDT 2013


Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0" 0 0
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"

    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    HorizSync       28.0 - 33.0
    VertRefresh     43.0 - 72.0
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
##
##  disable watchdog timeouts for long-running CUDA kernels
##
    Option "Interactive" "false"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
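After restarting the X server, it may be worth checking whether the CUDA runtime still reports a run-time limit on kernels for the device. The sketch below is only one way to do that, and it shows what the runtime reports rather than proving that the xorg.conf option was honored. It queries the cudaDevAttrKernelExecTimeout attribute for every visible device; the output format is my own choice.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int count = 0;
  cudaError_t err = cudaGetDeviceCount(&count);
  if (err != cudaSuccess) {
    printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  for (int dev = 0; dev < count; ++dev) {
    int timeout_enabled = 0;
    // Nonzero means the driver enforces a run-time limit on kernels
    // for this device (typically because it also drives a display).
    err = cudaDeviceGetAttribute(&timeout_enabled,
                                 cudaDevAttrKernelExecTimeout, dev);
    if (err != cudaSuccess) {
      printf("Device %d: query failed: %s\n", dev, cudaGetErrorString(err));
      continue;
    }
    printf("Device %d: run-time limit on kernels: %s\n",
           dev, timeout_enabled ? "yes" : "no");
  }
  return 0;
}
```

If the device still reports "yes", long-running kernels can be expected to hit the same timeout as before.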