Question

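I am building a distance matrix in which each row represents a point and each column holds the distance between that point and every other point in the data. My algorithm works fine sequentially. However, when I try to parallelize it, I get a segmentation fault. Below is my parallel code, where dat is a map containing all my data. Any help would be highly appreciated.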

map< int,string >::iterator datIt;
map< int,string >::iterator datIt2;
map <int, map< int, double> > dist;
int mycont=0;
datIt=dat.begin();
int size=dat.size();
#pragma omp  parallel //construct the distance matrix
{   
  #pragma omp for   
  for(int i=0;i<size;i++)
  {
    datIt2=dat.find((*datIt).first);
    datIt2++;
    while(datIt2!=dat.end())
    {
      double ecl=0;
      int c=count((*datIt).second.begin(),(*datIt).second.end(),delm)+1;
      string line1=(*datIt).second;
      string line2=(*datIt2).second;
      for (int i=0;i<c;i++)
      {
        double num1=atof(line1.substr(0,line1.find_first_of(delm)).c_str());
        line1=line1.substr(line1.find_first_of(delm)+1).c_str();
        double num2=atof(line2.substr(0,line2.find_first_of(delm)).c_str());
        line2=line2.substr(line2.find_first_of(delm)+1).c_str();
        ecl += (num1-num2)*(num1-num2);
      }
      ecl=sqrt(ecl);
      dist[(*datIt).first][(*datIt2).first]=ecl;
      dist[(*datIt2).first][(*datIt).first]=ecl;
      datIt2++;
    }
    datIt++;
  }
}

Answer 1

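I'm not sure whether it is the only problem with your code, but standard containers (such as std::map) are not thread-safe, at least when you write to them. So if you have any write access to your maps, such as dist[(*datIt).first][(*datIt2).first]=ecl;, you need to wrap every access to the map in some sort of synchronization construct, either #pragma omp critical or a mutex (an OpenMP lock or, if you use Boost or C++11, boost::mutex or std::mutex are options too):

//before the parallel:
omp_lock_t lock;
omp_init_lock(&lock);
...
omp_set_lock(&lock);
dist[(*datIt).first][(*datIt2).first]=ecl;
dist[(*datIt2).first][(*datIt).first]=ecl;
omp_unset_lock(&lock);
...
//after the parallel:
omp_destroy_lock(&lock);

Since you only read from dat, that should be fine without synchronization (in C++11 at least; C++03 makes no guarantees about thread safety whatsoever, since it has no concept of threads). It should typically still be fine to use it without synchronization, but technically that is implementation-dependent behaviour.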
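For the std::mutex option mentioned above, a minimal sketch could look like the following. The helper name buildDistances and the input format (points already parsed into vectors of doubles rather than delimited strings) are assumptions made for illustration, not part of the question's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <mutex>
#include <vector>

// Hypothetical helper: fills a symmetric distance map for points that are
// already parsed into vectors (an assumption; the question parses strings).
std::map<int, std::map<int, double>>
buildDistances(const std::vector<std::vector<double>>& pts) {
    std::map<int, std::map<int, double>> dist;
    std::mutex distMutex;  // guards every write to dist
    const int n = static_cast<int>(pts.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            assert(pts[i].size() == pts[j].size());  // same dimension assumed
            double ecl = 0.0;
            for (std::size_t k = 0; k < pts[i].size(); ++k) {
                const double d = pts[i][k] - pts[j][k];
                ecl += d * d;
            }
            ecl = std::sqrt(ecl);

            // Only one thread at a time may modify the shared map.
            std::lock_guard<std::mutex> guard(distMutex);
            dist[i][j] = ecl;
            dist[j][i] = ecl;
        }
    }
    return dist;
}
```

The lock_guard releases the mutex at the end of each inner iteration, so the distance computation itself still runs in parallel and only the two map writes are serialized.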

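Furthermore, since you didn't specify data sharing, all variables declared outside the parallel region are shared by default. Therefore your write accesses to datIt and datIt2 are also race conditions. For datIt2 this can be avoided by either specifying it as private or, even better, declaring it at the point of first use:

map< int,string >::iterator datIt2=dat.find((*datIt).first);

Solving this for datIt is a bit more problematic, since it seems you want to iterate over the whole length of the map. The easiest way (one that avoids a costly O(n) advance in every iteration) seems to be operating on a private copy of datIt, which is advanced accordingly (not guaranteeing 100% correctness, just a quick outline):

#pragma omp parallel //construct the distance matrix
{
   map< int,string >::iterator datItLocal=datIt;
   int lastIdx = 0;
   #pragma omp for
   for(int i=0;i<size;i++)
   {
       std::advance(datItLocal, i - lastIdx);
       lastIdx = i;
       //use datItLocal instead of datIt every time you reference datIt in the parallel region
       //remove ++datIt
   }
}

This way the map is iterated omp_get_num_threads() times, but it should work. If that is an unacceptable overhead for you, look at this answer of mine for alternative solutions for looping over a bidirectional iterator in OpenMP.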
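To make the per-thread iterator idea concrete, here is a small self-contained sketch. The function keysInOrder is hypothetical and only copies the map keys into a vector, which stands in for the real per-row work. Because std::advance accepts a negative distance for bidirectional iterators, the sketch does not depend on the order in which a thread receives its iterations.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: copies the keys of dat into a vector, one loop
// iteration per map element, using a per-thread iterator.
std::vector<int> keysInOrder(const std::map<int, std::string>& dat) {
    const int size = static_cast<int>(dat.size());
    std::vector<int> keys(size);  // preallocated, each slot written once

    #pragma omp parallel
    {
        // Declared inside the parallel region, so each thread has its own.
        std::map<int, std::string>::const_iterator datItLocal = dat.begin();
        int lastIdx = 0;

        #pragma omp for
        for (int i = 0; i < size; ++i) {
            // Move from the last visited position to element i.
            std::advance(datItLocal, i - lastIdx);
            lastIdx = i;
            assert(datItLocal != dat.end());
            keys[i] = datItLocal->first;  // distinct slot per i, no lock needed
        }
    }
    return keys;
}
```

Each thread walks the map at most once from the beginning to its last assigned element, which is where the omp_get_num_threads() traversals come from.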

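As a side note: maybe I missed something, but it seems to me that, since datIt is an iterator into dat, dat.find(datIt->first) is a bit redundant. There should be only one element in the map with the given key, and datIt points to it, so this seems like an expensive way of saying datIt2=datIt (correct me if I'm wrong).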
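A short sketch of that side note, assuming a hypothetical helper named firstPartner: since the keys of a std::map are unique, the lookup by key returns the same iterator, so the start of the inner loop can be obtained with std::next instead of find followed by an increment.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

// Hypothetical helper: returns the iterator the inner loop should start from,
// i.e. the element right after datIt, without a lookup by key.
std::map<int, std::string>::const_iterator
firstPartner(const std::map<int, std::string>& dat,
             std::map<int, std::string>::const_iterator datIt) {
    assert(datIt != dat.end());
    // Keys in a std::map are unique, so looking the key up again
    // lands on the very same element.
    assert(dat.find(datIt->first) == datIt);
    return std::next(datIt);  // replaces find(...) followed by ++
}
```

This saves one O(log n) lookup per outer iteration and keeps the intent (start at the next element) visible in the code.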

Answer 2

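In addition to Girzzly's answer: you do not specify anything as private or shared. This usually means that you do not control access to your memory. You have to declare datIt firstprivate and datIt2 private when you enter the parallel region; otherwise every thread overwrites their shared values, which leads to segfaults.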
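A minimal sketch of those data-sharing clauses follows. The function pairsPerRow is hypothetical and only counts the pairs each row would visit. Note one assumption beyond the answer text: a firstprivate copy of datIt starts at dat.begin() in every thread, so the sketch also keeps a private lastIdx and advances the copy to the current loop index.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: for each element i, counts the elements after it,
// i.e. the number of pairs the inner while loop would visit.
std::vector<int> pairsPerRow(const std::map<int, std::string>& dat) {
    typedef std::map<int, std::string>::const_iterator Iter;
    const int size = static_cast<int>(dat.size());
    std::vector<int> pairs(size, 0);

    Iter datIt = dat.begin();  // copied into every thread by firstprivate
    Iter datIt2;               // each thread gets its own copy via private
    int lastIdx = 0;

    #pragma omp parallel for firstprivate(datIt, lastIdx) private(datIt2)
    for (int i = 0; i < size; ++i) {
        std::advance(datIt, i - lastIdx);  // bring the private copy to row i
        lastIdx = i;
        assert(datIt != dat.end());
        for (datIt2 = std::next(datIt); datIt2 != dat.end(); ++datIt2) {
            ++pairs[i];  // row i is handled by exactly one thread
        }
    }
    return pairs;
}
```

With the clauses in place, no thread touches another thread's iterators, and the only shared writes go to distinct elements of pairs.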

Instead of using locks, I would use a critical section, which is more OpenMP-ish.
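As a rough illustration of the critical-section variant, the sketch below uses a hypothetical function distances1D that works on one-dimensional coordinates (an assumption made to keep the example short). The map writes are the only part placed inside #pragma omp critical.

```cpp
#include <cmath>
#include <map>
#include <vector>

// Hypothetical helper: pairwise distances between 1-D coordinates
// (an assumption for brevity; the question uses multi-dimensional points).
std::map<int, std::map<int, double>>
distances1D(const std::vector<double>& xs) {
    std::map<int, std::map<int, double>> dist;
    const int n = static_cast<int>(xs.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            const double ecl = std::fabs(xs[i] - xs[j]);
            // Executed by one thread at a time; no lock object to manage.
            #pragma omp critical
            {
                dist[i][j] = ecl;
                dist[j][i] = ecl;
            }
        }
    }
    return dist;
}
```

Compared with omp_lock_t, there is nothing to initialize or destroy, and the code still compiles and runs serially when OpenMP is disabled.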

Segmentation fault openmp error
