Question

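I am trying to get CUDA to run a kernel on my webcam video.

What I want is this: retrieve data from my webcam, send it to my GPU, run the kernel, and then send the resulting image back.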

#include "cuda.h"
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#include <Windows.h>
#include "Bitmap.h"

#include "OpenCVTest.h"

#include <opencv2/opencv.hpp>

using namespace cv;

#define Pixel unsigned char


__global__ void TestKernel(unsigned char * img)
{
    int index = threadIdx.x + blockIdx.x * blockDim.x;
    img[index] = 100;
}

int main(void) 
{
    VideoCapture cap(0); 
    Mat input;
    Mat frame;
    Mat Output;
    cap >> frame;
    //cap >> Output;
    cvtColor(frame, Output, CV_BGR2GRAY);
    uchar *d_frame;
    size_t size = (int) (640 * 480);
    cudaMalloc((void **)&d_frame, size);

    namedWindow("Window",1);
    for(;;)
    {
        cap >> input; 
        cvtColor(input, frame, CV_BGR2GRAY);        

        cudaMemcpy(d_frame, frame.data, size, cudaMemcpyHostToDevice);

        TestKernel<<<640 * 480, 1>>>( d_frame );

        cudaMemcpy(Output.data, d_frame, size, cudaMemcpyDeviceToHost);

        imshow("Window", Output);
        if(waitKey(30) >= 0) break;
    }

    cudaFree(d_frame);

    return 0;
}

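I just wrote a very basic test kernel. But it looks like the kernel is not being executed, because the image I display is just the grayscale video from my webcam.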

Edit

Following Robert's suggestion, I added some error checking by adding

gpuErrchk( cudaPeekAtLastError() );

after the kernel call, where gpuErrchk is

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
   if (code != cudaSuccess) 
   {
      fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
      if (abort) exit(code);
   }
}

[Image: GPUError]

Answer 1

640 * 480 = 307200

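Unless you have a GPU of compute capability (cc) 3.0 or higher and have compiled your code for it, this is not an acceptable choice for the first kernel configuration parameter: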

    TestKernel<<<640 * 480, 1>>>( d_frame );

For pre-cc 3.0 devices, the first parameter (or either of the first two dimensions of a dim3 quantity), i.e. the maximum x-dimension of a grid of thread blocks, is limited to 65535.
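The arithmetic behind that limit can be sketched in plain host-side C++. This is only an illustration: the helper names `blocksFor`, `fitsPreCc30` and `kMaxGridXPreCc30` are hypothetical, and 256 threads per block is an assumed, commonly used value that the question does not specify.

```cpp
#include <cassert>
#include <cstddef>

// Grid x-dimension limit for pre-cc 3.0 devices, as quoted in the answer.
// (Hypothetical constant name, introduced for this sketch.)
constexpr std::size_t kMaxGridXPreCc30 = 65535;

// Number of blocks needed to cover n elements when each block has
// threadsPerBlock threads (ceiling division). Hypothetical helper.
constexpr std::size_t blocksFor(std::size_t n, std::size_t threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// True if a 1D grid with this many blocks fits the pre-cc 3.0 limit.
constexpr bool fitsPreCc30(std::size_t blocks)
{
    return blocks <= kMaxGridXPreCc30;
}

// One thread per block, as in the question: 307200 blocks, over the limit.
static_assert(blocksFor(640 * 480, 1) == 307200u, "one block per pixel");
static_assert(!fitsPreCc30(blocksFor(640 * 480, 1)), "exceeds 65535");

// Assumed 256 threads per block: 1200 blocks, well under the limit.
static_assert(blocksFor(640 * 480, 256) == 1200u, "307200 / 256");
static_assert(fitsPreCc30(blocksFor(640 * 480, 256)), "fits in 65535");

// In the .cu file the launch could then read, for example:
//   TestKernel<<<blocksFor(size, 256), 256>>>(d_frame);
// If size were not a multiple of 256, the kernel would also need a
// bounds check (and therefore a size parameter) before writing img[index].
```

Because 640 * 480 is an exact multiple of 256, the existing index computation in TestKernel (threadIdx.x + blockIdx.x * blockDim.x) covers every pixel exactly once with this configuration.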

If you do proper CUDA error checking, you will find that the kernel is not running (and/or other errors). You can also try running your code with cuda-memcheck as a quick test.

Hello world of image processing with CUDA
