Question

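I read in this question that Eigen performs very well. However, I tried comparing the speed of Eigen MatrixXi multiplication against numpy array multiplication, and numpy performed better (~26 seconds vs. ~29 seconds). Is there a more efficient way to do this in Eigen?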

Here is my code:

Numpy:

import numpy as np
import time

n_a_rows = 4000
n_a_cols = 3000
n_b_rows = n_a_cols
n_b_cols = 200

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)

start = time.time()
d = np.dot(a, b)
end = time.time()

print("time taken : {}".format(end - start))

Result:

time taken : 25.9291000366

Eigen:

#include <ctime>
#include <iostream>
#include <Eigen/Dense>
using namespace Eigen;
int main()
{

  int n_a_rows = 4000;
  int n_a_cols = 3000;
  int n_b_rows = n_a_cols;
  int n_b_cols = 200;

  MatrixXi a(n_a_rows, n_a_cols);

  for (int i = 0; i < n_a_rows; ++ i)
      for (int j = 0; j < n_a_cols; ++ j)
        a (i, j) = n_a_cols * i + j;

  MatrixXi b (n_b_rows, n_b_cols);
  for (int i = 0; i < n_b_rows; ++ i)
      for (int j = 0; j < n_b_cols; ++ j)
        b (i, j) = n_b_cols * i + j;

  MatrixXi d (n_a_rows, n_b_cols);

  clock_t begin = clock();

  d = a * b;

  clock_t end = clock();
  double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
  std::cout << "Time taken : " << elapsed_secs << std::endl;

}

Result:

Time taken : 29.05

I am using numpy 1.8.1 and Eigen 3.2.0-4.

Answer 1

Change:

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)

to:

a = np.arange(n_a_rows * n_a_cols).reshape(n_a_rows, n_a_cols)*1.0
b = np.arange(n_b_rows * n_b_cols).reshape(n_b_rows, n_b_cols)*1.0

This gives a 100x speedup, at least on my laptop:

time taken : 11.1231250763

vs.

time taken : 0.124922037125

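That is, unless you really want to multiply integers. In Eigen, multiplying doubles is also faster (this amounts to replacing MatrixXi with MatrixXd three times), but there I only see a factor of 1.5: Time taken : 0.555005 vs 0.846788.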
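The "* 1.0" trick can also be written as an explicit cast, which makes it easier to check that the floating-point result still matches the integer one. The sketch below is only an illustration under assumptions: the sizes are hypothetical and much smaller than in the question so the integer case finishes quickly, and the helper name timed_dot is made up for this example. The speedup most likely comes from numpy passing floating-point products to an optimized BLAS routine while integer products fall back to a generic loop, so the actual ratio depends on how numpy was built.

```python
import time
import numpy as np

# Hypothetical sizes, smaller than in the question, so the integer case is quick.
n_a_rows, n_a_cols, n_b_cols = 400, 300, 20

# Explicit int64 so the integer product cannot overflow on platforms where
# numpy's default integer is 32-bit.
a_int = np.arange(n_a_rows * n_a_cols, dtype=np.int64).reshape(n_a_rows, n_a_cols)
b_int = np.arange(n_a_cols * n_b_cols, dtype=np.int64).reshape(n_a_cols, n_b_cols)

# astype(np.float64) is the explicit form of multiplying by 1.0.
a_float = a_int.astype(np.float64)
b_float = b_int.astype(np.float64)


def timed_dot(x, y):
    """Return (product, elapsed wall-clock seconds) for np.dot(x, y)."""
    start = time.perf_counter()
    product = np.dot(x, y)
    return product, time.perf_counter() - start


d_int, t_int = timed_dot(a_int, b_int)
d_float, t_float = timed_dot(a_float, b_float)

print("int64   : {:.6f} s".format(t_int))
print("float64 : {:.6f} s".format(t_float))

# float64 represents integers exactly only up to 2**53, so the float result
# can stand in for the integer result only while every entry stays below that.
fits_in_float = int(d_int.max()) < 2**53
print("largest entry below 2**53:", fits_in_float)
print("results identical:", np.array_equal(d_int, d_float.astype(np.int64)))
```

With these small sizes every entry stays well below 2**53, so the two results should agree exactly. With the question's full sizes the largest entry appears to exceed 2**53 (it is on the order of 10^16), so there the float result may differ from the exact integer product in its last digits. That is one reason to keep integers if exact values matter.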

Answer 2

@Jitse Niesen and @ggael answered my question in the comments.

I needed to add flags to enable optimization at compile time: -O2 -DNDEBUG (the O is a capital letter o, not a zero).

With these flags, the Eigen code runs in 0.6 seconds instead of ~29 seconds.

Eigen matrix vs Numpy array multiplication performance
