I am using CUDA C/C++, and my CUDA compilation tools are version 7.0.
I have a struct and a class:
struct racer{
    bool active;
    int Distance;
    int currentPosition;
};
The class:
class Game{
public:
    vector <racer> racersVector;
    bool runTimeStep();
};
I have a member function that modifies "racersVector":
bool Game::runTimeStep(){
    //this is 1 timestep, this is the part of code to be run on the GPU with "racersVector.size()" blocks/threads in parallel
    //-----------------------
    for (int j = 0; j < racersVector.size(); j++){
        racersVector[j].currentPosition++;
        if (racersVector[j].currentPosition >= racersVector[j].Distance)
            racersVector[j].active = false;
    }
    //-----------------------
    return true; //placeholder: the original snippet omits the return value
}
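To make the per-racer work explicit, here is a host-only C++ sketch of the same timestep written against a raw array and an index, which is roughly the shape a single GPU thread would need. The names `stepRacer` and `runTimeStepOnArray` are hypothetical helpers introduced for illustration; they are not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct racer {
    bool active;
    int Distance;
    int currentPosition;
};

// Hypothetical helper: advances racer j by one timestep.
// It only touches element j, so separate indices are independent of each other.
void stepRacer(racer* racers, std::size_t n, std::size_t j) {
    if (j >= n) return;  // ignore out-of-range indices
    racers[j].currentPosition++;
    if (racers[j].currentPosition >= racers[j].Distance)
        racers[j].active = false;
}

// Hypothetical helper: host loop equivalent to the marked part of runTimeStep().
void runTimeStepOnArray(racer* racers, std::size_t n) {
    assert(racers != nullptr || n == 0);
    for (std::size_t j = 0; j < n; j++)
        stepRacer(racers, n, j);
}
```

On the host this could be called as `runTimeStepOnArray(racersVector.data(), racersVector.size());`. Because each call to `stepRacer` reads and writes only its own element, the loop iterations do not depend on one another.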
So, from my main, I use the class like this:
Game game1;
game1.initialise();
while(true){
    game1.runTimeStep();
}
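I am trying to use CUDA for the commented part of the code. The goal is to copy the class object, or just the "vector racersVector" instance, to the device, run "computeTimeStep" (the CUDA kernel I want to implement) as many times as I like, and then copy the vector back from the device to the host whenever I want to inspect its state. Ideally it would look like this: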
Game game1;
game1.initialise();
here-the-code-to-copy-game1.racersVector-to-device
computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
computeTimeStep <<<N,1>>> ();
copyBackToHost (game1.racersVector);
game1.printInfo();
So I modified my main program:
int main()
{
    Game game1;
    game1.initialise();
    //trying to copy game1.racersVector to device
    vector<racer> *d_vec;
    cudaMalloc((void **)&d_vec, sizeof(game1.racersVector));
    cudaMemcpy(d_vec, &game1.racersVector, sizeof(game1.racersVector), cudaMemcpyHostToDevice);
If I understand correctly, this should copy "game1.racersVector" to the device.
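One detail worth checking in that copy: `sizeof` applied to a `std::vector` gives the size of the vector object itself (its internal bookkeeping), not the size of the elements, which live in separately allocated storage reachable through `data()`. The host-only sketch below illustrates the difference; `elementBytes` and `copyElementsTo` are hypothetical helper names, and `std::memcpy` stands in for whatever copy routine is actually used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

struct racer {
    bool active;
    int Distance;
    int currentPosition;
};

// A raw byte copy is only valid for trivially copyable element types.
static_assert(std::is_trivially_copyable<racer>::value,
              "racer must be trivially copyable for a raw byte copy");

// Hypothetical helper: number of bytes occupied by the elements themselves.
std::size_t elementBytes(const std::vector<racer>& v) {
    return v.size() * sizeof(racer);
}

// Hypothetical helper: copies the elements (not the vector object) into a
// plain buffer, reading from src.data() rather than from &src.
void copyElementsTo(racer* dst, std::size_t capacity, const std::vector<racer>& src) {
    assert(capacity >= src.size());  // destination must hold every element
    if (!src.empty())
        std::memcpy(dst, src.data(), elementBytes(src));
}
```

With 100 racers, `elementBytes` returns `100 * sizeof(racer)`, while `sizeof(game1.racersVector)` stays the same no matter how many elements the vector holds. A byte copy starting at `&game1.racersVector` therefore transfers the vector's internal host pointers rather than the racers.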
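My idea is to write a CUDA function (kernel) that performs one timestep on "vector racersVector", but when I try to create a CUDA kernel that takes a pointer to the vector as its argument: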
__global__ void computeTimeStep (vector<racer> *cud){
    cud->resize(4);
}
nvcc says:
cudaex2.cu(46): error: calling a __host__ function("std::vector<racer, std::allocator<racer> > ::resize") from a __global__ function("computeStep") is not allowed
How can I copy "racersVector" to the device and then work on that vector with a CUDA kernel?