Question

#pragma omp parallel for private(x,y)
for (int j = 0; j < nDstSizeY; j++)
{

   for (int i = 0; i < nDstSizeX; i++){

        x = MapX.at<float>(j, i);
        y = MapY.at<float>(j, i);

        if (nSrcType == CV_8UC1)
        {
            Dst.at<uchar>(j, i) = Bilinear8UC1(Src, x, y);
        }
        else
        {
            Dst.at<Vec3b>(j, i) = Bilinear8UC3(Src, x, y);
        }
    }
}

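I want to convert this code to TBB, but I am stuck on the local variables (private(x, y) in OpenMP), and my program does not run any faster. My TBB code looks like this: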

tbb::parallel_for(0, nDstSizeY, [&](int j){
    for (int i = 0; i < nDstSizeX; i++)
    {
        x = MapX.at<float>(j, i);
        y = MapY.at<float>(j, i);

        if (nSrcType == CV_8UC1)
        {
            Dst.at<uchar>(j, i) = Bilinear8UC1(Src, x, y);
        }
        else
        {
            Dst.at<Vec3b>(j, i) = Bilinear8UC3(Src, x, y);
        }
    }
});

How can I fix this? Sorry for my poor English.

Answer 1

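Because of the [&] capture, x and y are shared between threads, so this TBB version is not equivalent to the OpenMP one. If you want to keep the private(x,y) semantics when converting to TBB, add the variables explicitly to the lambda capture: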

[&,x,y](int j)

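Or simply declare x and y as local variables inside the lambda. Otherwise, the shared x and y cause a data race. Keep in mind that by-value captures are read-only unless the lambda is marked mutable, so when the loop body assigns to x and y, declaring them inside the lambda is the simpler option.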
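To illustrate the local-variable approach, here is a minimal sketch that builds with only the standard library. Everything in it is an assumption for illustration: parallel_for_rows is a hypothetical std::thread stand-in for tbb::parallel_for(first, last, body), remap_rows is a made-up function, and x + y replaces the Bilinear8UC1 lookup from the question. The point is only that x and y declared inside the lambda are private to each call.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for tbb::parallel_for(first, last, body):
// splits [first, last) into contiguous chunks, one per worker thread.
// The body is taken by const reference, so it must be const-callable.
template <typename Body>
void parallel_for_rows(int first, int last, const Body& body, int workers = 4) {
    std::vector<std::thread> pool;
    const int total = last - first;
    const int chunk = (total + workers - 1) / workers;
    for (int begin = first; begin < last; begin += chunk) {
        const int end = std::min(begin + chunk, last);
        pool.emplace_back([begin, end, &body] {
            for (int j = begin; j < end; ++j) body(j);
        });
    }
    for (std::thread& t : pool) t.join();
}

// Hypothetical remap: dst(j, i) = mapX(j, i) + mapY(j, i). The sum stands in
// for the interpolation call in the question.
std::vector<float> remap_rows(const std::vector<float>& mapX,
                              const std::vector<float>& mapY,
                              int rows, int cols) {
    std::vector<float> dst(static_cast<std::size_t>(rows) * cols);
    parallel_for_rows(0, rows, [&](int j) {
        for (int i = 0; i < cols; ++i) {
            const std::size_t idx = static_cast<std::size_t>(j) * cols + i;
            // Declared inside the lambda: every call has its own x and y,
            // so no thread can overwrite another thread's values.
            const float x = mapX[idx];
            const float y = mapY[idx];
            dst[idx] = x + y;  // each row writes a disjoint part of dst
        }
    });
    return dst;
}
```

With TBB available, the same lambda could be passed to tbb::parallel_for(0, rows, ...) in place of the stand-in helper; the locals stay private either way.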

Another suggestion is to use blocked_range2d, which may enable some additional cache optimizations:

tbb::parallel_for( tbb::blocked_range2d<int>(0, nDstSizeY, 0, nDstSizeX)
                 , [&](const tbb::blocked_range2d<int>& r) {
  for( int j = r.rows().begin(); j < r.rows().end(); j++ )
      for( int i = r.cols().begin(); i < r.cols().end(); i++ ) {
          // Declared locally, so each task has its own copy. float matches
          // the element type read from the maps (int would truncate).
          float x = MapX.at<float>(j, i);
          float y = MapY.at<float>(j, i);

          if (nSrcType == CV_8UC1)
              Dst.at<uchar>(j, i) = Bilinear8UC1(Src, x, y);
          else
              Dst.at<Vec3b>(j, i) = Bilinear8UC3(Src, x, y);
      }
});
