Question

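I have a question about some strange behavior in CUDA. I am currently developing a Monte Carlo simulation of particle trajectories, and I am doing the following.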

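The position p(n) of my particle at a given date t(n) depends on its position p(n-1) at the previous date t(n-1). In practice, say the value v(n) is computed from the value p(n-1). Here is a simplified example of my code: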

__device__ inline double calculateStep( double drift, double vol, double dt, double randomWalk, double S_t){
  return exp((drift - vol*vol*0.5)*dt + randomWalk*vol*sqrt(dt))*S_t;
}    

__device__ double DoSomethingWith(double v_n, ….) {
  ...
  return v_n*exp(t)*S;
}



__global__ void myMCsimulation( double* matrice, double * randomWalk, int nbreSimulation, int nPaths, double drift, ……) {


  double dt = T/nPaths;
  unsigned int tid = threadIdx.x + blockDim.x * blockIdx.x; 
  unsigned int stride = blockDim.x*gridDim.x;
  unsigned int index = tid;  
  double mydt = (index - nbreSimulation)/nbreSimulation*dt + dt;

  for ( index = tid; index < nbreSimulation*nPaths; index += stride) {
    if (index >= nbreSimulation)
    {
     double v_n = DoSomethingWith(drift,dt, matrice[index - nbreSimulation]);
     matrice[index] = matrice[index - nbreSimulation ] * calculateStep(drift,v_n,dt,randomWalk[index]);
    }
...}

The last line of code:

matrice[index] = matrice[index - nbreSimulation ] * calculateStep(drift,v_n,dt,randomWalk[index]);

only lets me fill the second row of the matrix matrice. I don't know why.

When I change that line to the following:

matrice[index] = DoSomethingWith(drift,dt, matrice[index - nbreSimulation]);

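my matrix is filled completely, all of my values are updated, and I can read back matrice[index - nbreSimulation]. I think this is a concurrent access problem, but I'm not sure. I tried __syncthreads() but it did not help.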

Can anyone help me solve this problem?

Thanks a lot

Answer 1

I changed my code as follows, and now it works perfectly.

        if (index < nbreSimulation) {
            matrice[index] = S0;
            for (workingCol = 1; workingCol < nPaths; workingCol++) {
                previousMove = index;
                index = index + nbreSimulation;
                ................
                matrice[index] = calculateStep(drift, vol_int[index], dt, randomWalk[index], matrice[previousMove]);
            }
        }
    }
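
For anyone landing here later, this is a minimal, self-contained sketch of the same idea: one thread owns one simulation and walks its time steps in order, so each read of the previous step is a value that same thread wrote. That may be why the original grid-stride loop over every element failed: a thread could read matrice[index - nbreSimulation] before the thread responsible for it had written it, and __syncthreads() only synchronizes threads within a single block. The names simulatePaths and runSimulation, the constant vol (the code above uses vol_int[index]), and the launch configuration are assumptions made for illustration, not the original code.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <cstddef>
#include <vector>

// Same update as calculateStep in the question: one geometric Brownian motion step.
__device__ inline double calculateStep(double drift, double vol, double dt,
                                       double randomWalk, double S_t) {
  return exp((drift - vol * vol * 0.5) * dt + randomWalk * vol * sqrt(dt)) * S_t;
}

// Hypothetical kernel: each thread handles whole simulations (columns) and
// iterates over the time steps (rows) sequentially, so no thread ever depends
// on a value written by another thread.
__global__ void simulatePaths(double* matrice, const double* randomWalk,
                              int nbreSimulation, int nPaths,
                              double S0, double drift, double vol, double dt) {
  int stride = blockDim.x * gridDim.x;
  for (int sim = threadIdx.x + blockDim.x * blockIdx.x; sim < nbreSimulation; sim += stride) {
    matrice[sim] = S0;  // row 0 holds the initial value
    for (int step = 1; step < nPaths; ++step) {
      size_t previous = (size_t)(step - 1) * nbreSimulation + sim;
      size_t current = (size_t)step * nbreSimulation + sim;
      matrice[current] = calculateStep(drift, vol, dt, randomWalk[current], matrice[previous]);
    }
  }
}

// Hypothetical host wrapper (error checking omitted for brevity).
// randomWalk must hold nbreSimulation * nPaths values.
std::vector<double> runSimulation(const std::vector<double>& randomWalk,
                                  int nbreSimulation, int nPaths,
                                  double S0, double drift, double vol, double dt) {
  size_t count = (size_t)nbreSimulation * nPaths;
  size_t bytes = count * sizeof(double);
  double* dMatrice = nullptr;
  double* dRandom = nullptr;
  cudaMalloc(&dMatrice, bytes);
  cudaMalloc(&dRandom, bytes);
  cudaMemcpy(dRandom, randomWalk.data(), bytes, cudaMemcpyHostToDevice);

  int threads = 128;  // assumed block size
  int blocks = (nbreSimulation + threads - 1) / threads;
  simulatePaths<<<blocks, threads>>>(dMatrice, dRandom, nbreSimulation, nPaths,
                                     S0, drift, vol, dt);

  std::vector<double> result(count);
  // Blocking copy on the default stream: it waits for the kernel to finish.
  cudaMemcpy(result.data(), dMatrice, bytes, cudaMemcpyDeviceToHost);
  cudaFree(dMatrice);
  cudaFree(dRandom);
  return result;
}
```

With this layout the parallelism is across simulations only; the time loop stays sequential inside each thread, which matches the dependence of step n on step n-1.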

Answer 2

I tried the following:

I declared a shared variable (an array of doubles) that holds the value computed at each iteration:

__shared__ double mat[];

......
for ( index = tid; index < nbreSimulation*nPaths; index += stride) {
   .....
  mat[index] = computedValue;
   ......
 }

No success. Does anyone see the problem?
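
A possible explanation, offered as a sketch rather than a diagnosis of the elided code: an unsized shared array has to be declared extern, its size in bytes is passed as the third kernel launch parameter, and it is private to each block. Indexing it with a global index that runs up to nbreSimulation*nPaths would therefore go past the end of the array, and values stored by one block are never visible to another. The kernel scaleThroughShared, the wrapper scaleOnDevice, and the factor parameter below are hypothetical names used only to show the declaration and indexing pattern.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Hypothetical kernel: stages one value per thread in dynamic shared memory.
__global__ void scaleThroughShared(const double* in, double* out, int n, double factor) {
  // Unsized shared arrays must be extern; the size comes from the launch.
  extern __shared__ double mat[];
  int local = threadIdx.x;                           // index inside this block
  int global = threadIdx.x + blockDim.x * blockIdx.x;

  // Index shared memory with the block-local index, never the global one.
  mat[local] = (global < n) ? in[global] * factor : 0.0;
  __syncthreads();  // reached by every thread of the block, outside any branch

  if (global < n) {
    out[global] = mat[local];
  }
}

// Hypothetical host wrapper (error checking omitted for brevity).
std::vector<double> scaleOnDevice(const std::vector<double>& input, double factor) {
  int n = (int)input.size();
  size_t bytes = input.size() * sizeof(double);
  double* dIn = nullptr;
  double* dOut = nullptr;
  cudaMalloc(&dIn, bytes);
  cudaMalloc(&dOut, bytes);
  cudaMemcpy(dIn, input.data(), bytes, cudaMemcpyHostToDevice);

  int threads = 128;  // assumed block size
  int blocks = (n + threads - 1) / threads;
  size_t sharedBytes = threads * sizeof(double);  // one double per thread in a block
  scaleThroughShared<<<blocks, threads, sharedBytes>>>(dIn, dOut, n, factor);

  std::vector<double> result(input.size());
  cudaMemcpy(result.data(), dOut, bytes, cudaMemcpyDeviceToHost);
  cudaFree(dIn);
  cudaFree(dOut);
  return result;
}
```

Because shared memory lives only as long as its block and cannot be read by other blocks, it does not by itself remove a dependence between elements handled by different blocks.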

CUDA and Monte Carlo: strange behavior
