Multithreaded MKL cblas_sgemm gives wrong results with g++

Date: 2016-06-30 08:40:20

Tags: c++ multithreading gcc intel-mkl

Here is an example sgemm program:

#include <mkl.h>
#include <iostream>
#include <cstdlib>
#define ITERATION 1

int main()
{
  int ra = 128;
  int lda = 75;
  int ldb = 55;
  float* left = (float*)calloc(ra * lda, sizeof(float));
  float* right = (float*)calloc(ldb * lda, sizeof(float));
  float* ans = (float*)calloc(ra * ldb, sizeof(float));
  std::cout << "left " << std::endl;
  for (int i = 0; i < ra; ++i) {
    for (int j = 0; j < lda; ++j) {
      left[i * lda + j] = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
      std::cout << left[i * lda + j] << " ";
    }
    std::cout << std::endl;
  }

  std::cout << "right " << std::endl;
  for (int i = 0; i < lda; ++i) {
    for (int j = 0; j < ldb; ++j) {
      right[i * ldb + j] = static_cast <float> (rand()) / static_cast <float> (RAND_MAX);
      std::cout << right[i * ldb + j] << " ";
    }
    std::cout << std::endl;
  }

  for (int i = 0; i < ITERATION; ++i) {
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, ra, ldb, lda, 1.0f, left, lda,
      right, ldb, 0.0f, ans, ldb);
  }

  std::cout << "ans " << std::endl;
  for (int i = 0; i < ra; ++i) {
    for (int j = 0; j < ldb; ++j) {
      std::cout << ans[i * ldb + j] << " ";
    }
    std::cout << std::endl;
  }

  return 0;
}

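I compile this program with g++ using the options -fopenmp -lmkl_rt, with OMP_NUM_THREADS set to 16.

After running the program, I found that the answer is completely wrong compared with the MATLAB result. I would not call it wrong if the difference were only a small precision error. I also observed that the program behaves correctly under any of these conditions:

  1. Using icc instead of g++
  2. Removing the -fopenmp flag
  3. Using g++ & ATLAS instead of icc & MKL
  4. Setting OMP_NUM_THREADS=1

So I guess the problem may lie in the -fopenmp flag. Could you help me with this? Thanks!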

      

    g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)
    icc (ICC) 16.0.3 20160415
    Linux kernel 2.6.32-279.el6.x86_64
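
For readers who want to check the sgemm output without MATLAB, the comparison described above can be sketched directly in C++. This is a minimal sketch, not part of the original program: `reference_gemm` and `max_abs_diff` are hypothetical helper names, and the row-major layout is assumed to match the `CblasRowMajor` call in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive row-major reference: C (m x n) = A (m x k) * B (k x n).
// Accumulates in double so the reference itself stays accurate.
// (Hypothetical helper, not an MKL function.)
std::vector<float> reference_gemm(const float* a, const float* b,
                                  int m, int n, int k) {
  assert(m > 0 && n > 0 && k > 0);
  std::vector<float> c(static_cast<std::size_t>(m) * n, 0.0f);
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      double sum = 0.0;
      for (int p = 0; p < k; ++p) {
        sum += static_cast<double>(a[i * k + p]) * b[p * n + j];
      }
      c[i * n + j] = static_cast<float>(sum);
    }
  }
  return c;
}

// Largest absolute element-wise difference between two buffers of `count` floats.
float max_abs_diff(const float* x, const float* y, std::size_t count) {
  float worst = 0.0f;
  for (std::size_t i = 0; i < count; ++i) {
    worst = std::max(worst, std::fabs(x[i] - y[i]));
  }
  return worst;
}
```

In the program above this could be called as `reference_gemm(left, right, ra, ldb, lda)` and the result compared against `ans` with `max_abs_diff(ans, &ref[0], ra * ldb)`. With inputs in [0, 1] and an inner dimension of 75, ordinary single-precision rounding should keep the difference very small; any threshold chosen (for example 1e-3) is an assumption, but a result that is "completely wrong" would exceed it by a wide margin.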

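1 answer:

Answer 0 (score: 0)

According to the MKL link line advisor, you do not need to use -fopenmp with the single dynamic library -lmkl_rt to enable multithreading. Since your gcc is quite old, this may be the problem.

You could try traditional dynamic linking and compare the following two setups to see where the problem lies.

Threaded MKL + GNU OpenMP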

Link options: -Wl,--no-as-needed -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lpthread -lm -ldl           
Compile options: -fopenmp -m64 -I${MKLROOT}/include

Threaded MKL + Intel OpenMP

Link options: -Wl,--no-as-needed -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -liomp5 -lpthread -lm -ldl
Compile options: -m64 -I${MKLROOT}/include
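
When comparing the two builds above, it may help to record which threading-related environment variables each run actually sees. The sketch below only reads the environment and does not call MKL. `env_or_unset` and `threading_env_summary` are hypothetical helper names. `OMP_NUM_THREADS` comes from the question; `MKL_NUM_THREADS` and `MKL_THREADING_LAYER` are variables MKL documents, the latter for choosing the threading layer of -lmkl_rt at run time. Whether the installed MKL version honors a particular value of it is an assumption to check against its documentation.

```cpp
#include <cstdlib>
#include <string>

// Returns the value of an environment variable, or "(unset)" if it is absent.
// (Hypothetical helper for illustration.)
std::string env_or_unset(const char* name) {
  const char* value = std::getenv(name);
  return value ? std::string(value) : std::string("(unset)");
}

// One-line summary of the threading-related settings visible to this process.
std::string threading_env_summary() {
  return "OMP_NUM_THREADS=" + env_or_unset("OMP_NUM_THREADS") +
         " MKL_NUM_THREADS=" + env_or_unset("MKL_NUM_THREADS") +
         " MKL_THREADING_LAYER=" + env_or_unset("MKL_THREADING_LAYER");
}
```

Printing `threading_env_summary()` at the start of `main` in each build makes it easier to tell whether a wrong result correlates with the link line or with the runtime settings.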