Question

I'm creating a floating-point matrix template class. The class declaration below shows only the relevant functions and members.

// columns, rows
template <unsigned int c, unsigned int r>
class Matrix {
 public:
  Matrix(float value);

  float& At(unsigned int x, unsigned int y);
  float const& At(unsigned int x, unsigned int y) const;
  template <unsigned int p> Matrix<p, r> MultipliedBy(Matrix<p, c> const& other);

 private:
  // column-major ordering
  float data_[c][r];
};

The implementation of each of the functions above is as follows.

template <unsigned int c, unsigned int r>
Matrix<c, r>::Matrix(float value) {
  std::fill(&data_[0][0], &data_[c][r], value);
}

template <unsigned int c, unsigned int r>
float& Matrix<c, r>::At(unsigned int x, unsigned int y) {
  if (x >= c || y >= r) {
    return data_[0][0];
  }

  return data_[x][y];
}

template <unsigned int c, unsigned int r>
float const& Matrix<c, r>::At(unsigned int x, unsigned int y) const {
  if (x >= c || y >= r) {
    return data_[0][0];
  }

  return data_[x][y];
}

template <unsigned int c, unsigned int r>
template <unsigned int p>
Matrix<p, r> Matrix<c, r>::MultipliedBy(Matrix<p, c> const& other) {
  Matrix<p, r> result(0.0f);

  for (unsigned int x = 0; x < c; x++) {
    for (unsigned int y = 0; y < r; y++) {
      for (unsigned int z = 0; z < p; z++) {
        result.At(z, y) += At(x, y) * other.At(z, x);
      }
    }
  }

  return result;
}

Now, a few lines of test code.

Matrix<4, 4> m1;

// m1 set to
//
//  1   2   3   4
//  5   6   7   8
//  9   10  11  12
//  13  14  15  16

Matrix<1, 4> m2;

// m2 set to
//
//  6
//  3
//  8
//  9

Matrix<1, 4> m3 = m1.MultipliedBy(m2);

This is where things get weird. When compiled (with g++) without optimization (-O0):

// m3 contains
//  0
//  0
//  0
//  0

With any optimization (-O1, -O2, or -O3):

// m3 contains
//  210
//  236
//  262
//  288

Note that even with optimization the answer is wrong (verified with an external calculator). So I narrowed it down to this call in MultipliedBy:

Matrix<p, r> result(0.0f);

If I instantiate result in any way, other becomes invalid (all of its data_ values are set to 0.0f). Before the allocation/initialization of result, other is still valid (6, 3, 8, 9).

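It's worth noting that if I multiply two matrices of the same (square) dimensions, I get completely valid and correct output regardless of the optimization level.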

Does anyone know what in the world g++ is pulling here? I'm running mingw with g++ (GCC) 4.6.1... could that have something to do with this problem?

Answer 1

&data_[c][r] is probably wrong: it is data_ + (c*r + r) * FS, whereas you probably want &data_[c-1][r-1] + FS, i.e. data_ + ((c-1)*r + (r-1) + 1) * FS, i.e. data_ + c*r * FS.

(Here FS == sizeof(float), and the addresses are counted in bytes.)
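
To make that arithmetic concrete, here is a minimal sketch that computes both offsets in units of floats rather than bytes. The helper names WrittenEnd, TrueEnd and Overrun are hypothetical, introduced only for illustration; they are not part of the question's code.

```cpp
// Offset (in floats, measured from &data_[0][0]) that &data_[c][r]
// refers to when the array is declared as float data_[c][r].
constexpr unsigned int WrittenEnd(unsigned int c, unsigned int r) {
  return c * r + r;
}

// Offset (in floats) of the real one-past-the-end position.
constexpr unsigned int TrueEnd(unsigned int c, unsigned int r) {
  return c * r;
}

// Number of floats that std::fill writes beyond the end of the array.
constexpr unsigned int Overrun(unsigned int c, unsigned int r) {
  return WrittenEnd(c, r) - TrueEnd(c, r);
}

// The overrun is always r floats, whatever the column count is.
static_assert(Overrun(1, 4) == 4, "Matrix<1, 4> overruns by 4 floats");
static_assert(Overrun(4, 4) == 4, "Matrix<4, 4> overruns by 4 floats");
```

For Matrix<1, 4> the constructor therefore writes 8 floats into a 4-float array, zeroing whatever happens to sit right after it in memory. That would be consistent with other being wiped when result is constructed, although which object actually sits there depends on the compiler's stack layout and the optimization level.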

Your last element is data_[c-1][r-1], so the one-past-the-end position is data_[c-1][r], not data_[c][r].
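
As a rough sketch of the fix, the class from the question could be written with the corrected end pointer as shown here. Everything except the second argument to std::fill follows the question's code; the MultiplyExample helper and the const qualifier on MultipliedBy are assumptions added for illustration only.

```cpp
#include <algorithm>

// columns, rows
template <unsigned int c, unsigned int r>
class Matrix {
 public:
  explicit Matrix(float value) {
    // The last element is data_[c-1][r-1], so the end of the range is
    // one float past it, not &data_[c][r].
    std::fill(&data_[0][0], &data_[c - 1][r - 1] + 1, value);
  }

  // Out-of-range indices fall back to element (0, 0), as in the question.
  float& At(unsigned int x, unsigned int y) {
    return (x >= c || y >= r) ? data_[0][0] : data_[x][y];
  }
  float const& At(unsigned int x, unsigned int y) const {
    return (x >= c || y >= r) ? data_[0][0] : data_[x][y];
  }

  template <unsigned int p>
  Matrix<p, r> MultipliedBy(Matrix<p, c> const& other) const {
    Matrix<p, r> result(0.0f);
    for (unsigned int x = 0; x < c; x++) {
      for (unsigned int y = 0; y < r; y++) {
        for (unsigned int z = 0; z < p; z++) {
          result.At(z, y) += At(x, y) * other.At(z, x);
        }
      }
    }
    return result;
  }

 private:
  // column-major ordering
  float data_[c][r];
};

// Hypothetical helper: builds the question's m1 and m2, returns m1 * m2.
inline Matrix<1, 4> MultiplyExample() {
  Matrix<4, 4> m1(0.0f);
  Matrix<1, 4> m2(0.0f);
  const float column[4] = {6.0f, 3.0f, 8.0f, 9.0f};
  for (unsigned int y = 0; y < 4; ++y) {
    m2.At(0, y) = column[y];
    for (unsigned int x = 0; x < 4; ++x) {
      // Fills 1..16 row by row; x is the column, y is the row.
      m1.At(x, y) = static_cast<float>(y * 4 + x + 1);
    }
  }
  return m1.MultipliedBy(m2);
}
```

With m1 and m2 filled as in the question's comments, MultiplyExample() should return the column 72, 176, 280, 384, which matches a hand calculation of the product, and the result should no longer depend on the optimization level because nothing is written outside data_.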

Complex template class function fails without g++ optimization
