Question

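The question is quite simple, but let me first outline my framework. I have an abstract class AbstractScheme that represents a type of computation (a kind of discretization of an equation, but that is not important here). Every implementation must provide a method that returns the name of the scheme, and must implement a protected function which is the CUDA kernel. The abstract base class provides a public method that calls the CUDA kernel and returns how long the kernel took to complete.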

class AbstractScheme
{
public:
    /**
     * @return The name of the scheme is returned
     */
    virtual std::string name() const =0;

    /**
     * Copies the input to the device,
     * computes the number of blocks and threads,
     * launches the kernel,
     * copies the output to the host,
     * and measures the time to do all of this.
     *
     * @return The number of milliseconds to perform the whole operation
     *         is returned
     */
    double doComputation(const float* input, float* output, int nElements)
    {
        // Does a lot of things and calls this->kernel().
    }

protected:
    /**
     * CUDA kernel which does the computation.
     * Must be implemented.
     */
    virtual __global__ void kernel(const float*, float*, int) =0;
};

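I also have a couple of implementations of this base class. But when I try to compile with nvcc 7.0, I get this error message, which refers to the line where I declare the function kernel in AbstractScheme (the last line of the class body in the listing above):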

myfile.cu(60): error: illegal combination of memory qualifiers

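I could not find any resource saying that kernels cannot be virtual functions, but I have the feeling that this is the problem. Can you explain the rationale behind this? I clearly understand how and why __device__ functions cannot be virtual functions (a virtual function is a pointer to an actual [host] function stored in the object, and you cannot call such a function from device code), but I am not sure about __global__ functions.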

Edit: The part of the question that I struck out is wrong. Please see the comments to understand why.

Answer 1

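Kernels cannot be members of classes in the CUDA object model, whether virtual or not. That is the cause of the compilation error, even if the error message emitted by the compiler does not make it particularly clear.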
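One possible way around this, as a rough sketch rather than a definitive design: keep each kernel as a free __global__ function and make the virtual member an ordinary host function that launches it. The names launch, doubleKernel and DoublingScheme below are hypothetical and not part of the question, and error checking on the CUDA calls is left out for brevity.

```cuda
#include <string>
#include <cuda_runtime.h>

// Hypothetical example kernel. It is a free function because a __global__
// function cannot be a class member.
__global__ void doubleKernel(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

class AbstractScheme
{
public:
    virtual ~AbstractScheme() {}
    virtual std::string name() const = 0;

    // Copies the input to the device, launches the kernel through the
    // virtual host-side hook, copies the output back, and returns the
    // elapsed time in milliseconds. Error checking omitted for brevity.
    double doComputation(const float* input, float* output, int nElements)
    {
        size_t bytes = nElements * sizeof(float);
        float* dIn = nullptr;
        float* dOut = nullptr;
        cudaMalloc(&dIn, bytes);
        cudaMalloc(&dOut, bytes);

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);
        cudaEventRecord(start);

        cudaMemcpy(dIn, input, bytes, cudaMemcpyHostToDevice);
        int threads = 256;
        int blocks = (nElements + threads - 1) / threads;
        launch(dIn, dOut, nElements, blocks, threads);
        cudaMemcpy(output, dOut, bytes, cudaMemcpyDeviceToHost);

        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(dIn);
        cudaFree(dOut);
        return ms;
    }

protected:
    // Hypothetical hook: a plain virtual host function, which is legal.
    // Each scheme overrides it to launch its own free kernel.
    virtual void launch(const float* dIn, float* dOut, int n,
                        int blocks, int threads) = 0;
};

// Hypothetical concrete scheme used only for illustration.
class DoublingScheme : public AbstractScheme
{
public:
    std::string name() const override { return "doubling"; }

protected:
    void launch(const float* dIn, float* dOut, int n,
                int blocks, int threads) override
    {
        doubleKernel<<<blocks, threads>>>(dIn, dOut, n);
    }
};
```

With this layout the virtual dispatch happens entirely on the host, so the compiler only ever sees __global__ on a non-member function. Calling doComputation on a DoublingScheme should fill the output with the doubled input values and return the measured time.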
