Question

I have matrix multiplication code that computes matrix A * matrix B = matrix C as follows:

for(j=1;j<=n;j++) {
 for(l=1;l<=k;l++) {
  for(i=1;i<=m;i++) {
   C[i][j] = C[i][j] + B[l][j]*A[i][l];
  }
 }
}

Now I want to turn it into a multithreaded matrix multiplication. My code is as follows.

I use a struct:

struct ij
{
 int rows;
 int columns;
};

My thread function is:

void *MultiplyByThread(void *t)
{
 struct ij *RowsAndColumns = t;
 double total=0; 
 int pos; 
 for(pos = 1;pos<k;pos++)
 {
  fprintf(stdout, "Current Total For: %10.2f",total);
  fprintf(stdout, "%d\n\n",pos);
  total += (A[RowsAndColumns->rows][pos])*(B[pos][RowsAndColumns->columns]);
 }
 D[RowsAndColumns->rows][RowsAndColumns->columns] = total;
 pthread_exit(0);

}

And inside my main I have:

for(i=1;i<=m;i++) {
  for(j=1;j<=n;j++) {

    struct ij *t = (struct ij *) malloc(sizeof(struct ij));
    t->rows = i;
    t->columns = j;

    pthread_t thread;
    pthread_attr_t threadAttr;
    pthread_attr_init(&threadAttr);
    pthread_create(&thread, &threadAttr, MultiplyByThread, t);
    pthread_join(thread, NULL);

  }
}

But I can't seem to get the same result as the first matrix multiplication (which is correct). Can anyone point me in the right direction?

Answer 1

Try the following:

#pragma omp parallel for private(i, l, j)
for(j=1;j<=n;j++) {
    for(l=1;l<=k;l++) {
        for(i=1;i<=m;i++) {
            C[i][j] = C[i][j] + B[l][j]*A[i][l];
        }
    }
}

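While Googling for the GCC compiler switch that enables OpenMP, I came across this blog post, which describes what happens better than I could and also contains a better example.

Most reasonably relevant compilers for multicore computers support OpenMP; see the OpenMP web site for details.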
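
To make the suggestion above concrete, here is a minimal sketch of the same loop nest as a standalone function. The function name `matmul_omp`, the flat row-major layout, and the 0-based indexing are assumptions made for illustration; they are not part of the original code. With GCC, OpenMP is enabled with the `-fopenmp` switch; without it the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>

/* Hypothetical helper (not from the original post):
 * computes C += A * B, where A is m x k, B is k x n and C is m x n,
 * all stored row-major with 0-based indices. */
void matmul_omp(int m, int k, int n,
                const double *A, const double *B, double *C)
{
    assert(A != 0 && B != 0 && C != 0);

    /* Each iteration of the outer loop owns one column j of C,
     * so no two threads ever write to the same element. */
    #pragma omp parallel for
    for (int j = 0; j < n; j++) {
        for (int l = 0; l < k; l++) {
            for (int i = 0; i < m; i++) {
                C[i * n + j] += A[i * k + l] * B[l * n + j];
            }
        }
    }
}
```

Declaring the loop counters inside the `for` statements makes them private to each thread automatically, so no `private(...)` clause is needed in this variant.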

Answer 2

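In fact, your threaded code isn't threaded. By calling join right after calling create, you create a thread and then wait for it to finish before starting the next one. You have to create an m x n matrix of threads, start them all, and then join them all. Apart from that, the code appears to compute the same thing as the loop. What exactly is the difference in the results?

Example (note: not compiled):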

pthread_t threads[m+1][n+1]; /* Threads that will execute in parallel; indices 1..m and 1..n are used */

Then in main:

 for(i=1;i<=m;i++) {
    for(j=1;j<=n;j++) {

    struct ij *t = (struct ij *) malloc(sizeof(struct ij));
    t->rows = i;
    t->columns = j;

    pthread_attr_t threadAttr;
    pthread_attr_init(&threadAttr);
    /* pthread_create takes a pointer to the pthread_t it fills in */
    pthread_create(&threads[i][j], &threadAttr, MultiplyByThread, t);
    }
  }

  /* join all the threads */
  for(i=1;i<=m;i++) {
    for(j=1;j<=n;j++) {
       pthread_join(threads[i][j], NULL);
    }
  }

(More or less; just don't call pthread_join for each thread inside the loop that creates them.)
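
For reference, the create-all-then-join-all pattern can be sketched as a self-contained function. The names `cell_job`, `cell_worker` and `matmul_threads`, the flat row-major layout, and the 0-based indexing are assumptions made for this sketch and do not come from the question. One detail that may explain the differing results: the thread function in the question loops with `pos < k`, while the serial version loops with `l <= k`, so the threaded version appears to skip the last term of each sum. The sketch below sums all k terms.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical job descriptor: one thread computes one element of C. */
struct cell_job {
    int row, col;      /* element of C to compute */
    int k, n;          /* inner dimension and number of columns of C */
    const double *A;   /* m x k, row-major */
    const double *B;   /* k x n, row-major */
    double *C;         /* m x n, row-major */
};

static void *cell_worker(void *arg)
{
    const struct cell_job *job = arg;
    double total = 0.0;
    for (int pos = 0; pos < job->k; pos++)   /* all k terms, 0-based */
        total += job->A[job->row * job->k + pos]
               * job->B[pos * job->n + job->col];
    job->C[job->row * job->n + job->col] = total;
    return NULL;
}

/* Computes C = A * B with one thread per element.
 * Returns 0 on success, -1 if allocation or thread creation fails. */
int matmul_threads(int m, int k, int n,
                   const double *A, const double *B, double *C)
{
    assert(m > 0 && k > 0 && n > 0);
    int count = m * n;
    pthread_t *threads = malloc(count * sizeof *threads);
    struct cell_job *jobs = malloc(count * sizeof *jobs);
    if (!threads || !jobs) { free(threads); free(jobs); return -1; }

    int started = 0, status = 0;
    /* Phase 1: start every thread without waiting for any of them. */
    for (int i = 0; i < m && status == 0; i++) {
        for (int j = 0; j < n; j++) {
            struct cell_job *job = &jobs[started];
            *job = (struct cell_job){ i, j, k, n, A, B, C };
            if (pthread_create(&threads[started], NULL, cell_worker, job) != 0) {
                status = -1;
                break;
            }
            started++;
        }
    }
    /* Phase 2: wait for all the threads that were actually started. */
    for (int t = 0; t < started; t++)
        pthread_join(threads[t], NULL);

    free(threads);
    free(jobs);
    return status;
}
```

Build with `-pthread`. One thread per element is only reasonable for small matrices, since creating a thread usually costs far more than a single dot product; giving each thread a whole row or a block of rows is a common alternative.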

Help with multithreaded matrix multiplication in C