Question

I am using CUDA C/C++, and my CUDA compilation tools are version 7.0.

I have a struct and a class:

struct racer{
    bool active;
    int Distance;
    int currentPosition;
};

The class:

class Game{
    public:
    vector <racer> racersVector;
    bool           runTimeStep();
};

I have a member function that modifies "racersVector":

bool Game::runTimeStep(){

    //this is 1 timestep, this is the part of code to be run on the GPU with "racersVector.size()" blocks/threads in parallel
    //-----------------------
    for (int j = 0; j < racersVector.size(); j++){
        racersVector[j].currentPosition++;

        if (racersVector[j].currentPosition >= racersVector[j].Distance)
            racersVector[j].active = false;
    }
    //-----------------------

}

So, from my main, I use the class like this:

Game game1;
game1.initialise();

while(true){
    game1.runTimeStep();
}

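I am trying to use CUDA for the commented part of the code. The goal is to copy the class object, or just the "vector racersVector" instance, to the device, run "computeTimeStep" (the CUDA kernel I want to implement) as many times as I want, and then, when I want to see the state of my vector, copy the vector back from the device to the host. Ideally it would look like this: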

Game game1;
game1.initialise();

here-the-code-to-copy-game1.racersVector-to-device

computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
copyBackToHost (game1.racersVector);
game1.printInfo();

So I modified my main program:

int main() 
{
    Game game1;
    game1.initialise();

    //trying to copy game1.racersVector to device
    vector<racer> *d_vec;
    cudaMalloc((void **)&d_vec, sizeof(game1.racersVector));
    cudaMemcpy(d_vec, &game1.racersVector, sizeof(game1.racersVector), cudaMemcpyHostToDevice);

If I understand correctly, this should copy "game1.racersVector" to the device.
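That assumption can be checked on the host. `sizeof(game1.racersVector)` is the size of the `std::vector` object itself (its internal bookkeeping, commonly a few pointers), not the size of the elements it manages, so the `cudaMemcpy` above would transfer the vector's internal host pointers rather than the racers. A minimal sketch, assuming the `racer` struct from the question; the helper name `payloadBytes` is hypothetical and only for illustration:

```cpp
#include <cstddef>
#include <vector>

struct racer {
    bool active;
    int Distance;
    int currentPosition;
};

// Hypothetical helper (not part of the question's code): number of bytes
// occupied by the elements themselves, which is what a host-to-device copy
// of the racers would need to transfer.
std::size_t payloadBytes(const std::vector<racer>& v) {
    return v.size() * sizeof(racer);
}
```

`sizeof` applied to a vector gives the same value for 4 racers as for 1000, whereas `payloadBytes` grows with the element count. The elements themselves start at `&game1.racersVector[0]` (or `game1.racersVector.data()` with C++11).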

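My idea is to create a CUDA function (kernel) that performs one time step on "vector racersVector", but when I try to create a CUDA kernel that takes a pointer to the vector as a parameter: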

__global__ void computeTimeStep (vector<racer> *cud){

    cud->resize(4);
}

nvcc says:

cudaex2.cu(46): error: calling a __host__ function("std::vector<racer, std::allocator<racer> > ::resize") from a __global__ function("computeStep") is not allowed

How can I copy "racersVector" to the device and then work on that vector with a CUDA kernel?
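One possible direction, offered as a sketch rather than a definitive answer: since `std::vector` member functions are host-only, the kernel can take a raw `racer*` and an element count instead of a `vector<racer>*`. The version below assumes the `racer` struct from the question and fixed-size data (no `resize` on the device). The helper `runTimeStepsOnDevice`, its parameters, and the block size of 256 are assumptions made for illustration, not part of the original code:

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

struct racer {
    bool active;
    int Distance;
    int currentPosition;
};

// One time step: each thread advances one racer, mirroring the body of the
// CPU loop in Game::runTimeStep().
__global__ void computeTimeStep(racer *racers, int n) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < n) {  // guard: the last block may have spare threads
        racers[j].currentPosition++;
        if (racers[j].currentPosition >= racers[j].Distance)
            racers[j].active = false;
    }
}

// Hypothetical helper (not from the question): copies the elements to the
// device, runs `steps` time steps there, and copies the result back.
cudaError_t runTimeStepsOnDevice(std::vector<racer> &racers, int steps) {
    int n = static_cast<int>(racers.size());
    if (n == 0) return cudaSuccess;
    std::size_t bytes = n * sizeof(racer);

    racer *d_racers = NULL;
    cudaError_t err = cudaMalloc((void **)&d_racers, bytes);
    if (err != cudaSuccess) return err;

    // Copy the elements, not the std::vector object itself.
    err = cudaMemcpy(d_racers, &racers[0], bytes, cudaMemcpyHostToDevice);

    int threads = 256;                         // assumed block size
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    for (int s = 0; s < steps && err == cudaSuccess; ++s) {
        computeTimeStep<<<blocks, threads>>>(d_racers, n);
        err = cudaGetLastError();              // catches launch errors
    }

    // This blocking copy also waits for the kernels launched above.
    if (err == cudaSuccess)
        err = cudaMemcpy(&racers[0], d_racers, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_racers);
    return err;
}
```

With something like this, the main program could call `runTimeStepsOnDevice(game1.racersVector, 5)` and then `game1.printInfo()`. The data stays on the device between kernel launches, so only one copy in each direction is needed. Launching with `<<<N,1>>>` as in the question would also work with this kernel, but one thread per block usually leaves most of the GPU idle. `thrust::device_vector<racer>`, which ships with the CUDA toolkit, is another option for managing the device copy.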

CUDA C++: modifying a function so that it runs on the device?

0 answers: