Reduce (sum) along an arbitrary axis of a multidimensional array

Date: 2018-04-18 14:10:32

Tags: c++ algorithm

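I want to perform a sum reduction along an arbitrary axis of a multidimensional matrix, which may have an arbitrary number of dimensions (e.g. axis 5 of a 10-dimensional array). The matrix is stored in row-major format, i.e. as a vector together with the strides along each axis.

I know how to perform this reduction using nested loops (see the example below), but doing so hard-codes both the axis (the reduction below is along axis 1) and the number of dimensions (4 below). How can I generalize it without using nested loops?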

#include <iostream>
#include <vector>

int main()
{
  // shape, stride & data of the matrix

  size_t shape  [] = { 2, 3, 4, 5};
  size_t strides[] = {60,20, 5, 1};

  std::vector<double> data(2*3*4*5);

  for ( size_t i = 0 ; i < data.size() ; ++i ) data[i] = 1.;

  // shape, stride & data (zero-initialized) of the reduced matrix

  size_t rshape  [] = { 2, 4, 5};
  size_t rstrides[] = {20, 5, 1};

  std::vector<double> rdata(2*4*5, 0.0);

  // compute reduction

  for ( size_t a = 0 ; a < shape[0] ; ++a )
    for ( size_t c = 0 ; c < shape[2] ; ++c )
      for ( size_t d = 0 ; d < shape[3] ; ++d )
        for ( size_t b = 0 ; b < shape[1] ; ++b )
          rdata[ a*rstrides[0]                 + c*rstrides[1] + d*rstrides[2] ] += \
          data [ a*strides [0] + b*strides [1] + c*strides [2] + d*strides [3] ];

  // print resulting reduced matrix

  for ( size_t a = 0 ; a < rshape[0] ; ++a )
    for ( size_t b = 0 ; b < rshape[1] ; ++b )
      for ( size_t c = 0 ; c < rshape[2] ; ++c )
        std::cout << "(" << a << "," << b << "," << c << ") " << \
        rdata[ a*rstrides[0] + b*rstrides[1] + c*rstrides[2] ] << std::endl;

  return 0;
}

Note: I would like to avoid 'unpacking' and 'packing' counters. By this I mean that, in pseudocode, I could do:

for ( size_t i = 0 ; i < data.size() ; ++i ) 
{
  i -> {a,b,c,d}

  discard "b" (axis 1) -> {a,c,d}

  rdata(a,c,d) += data(a,b,c,d)
}
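
For reference, the unpack/pack pseudocode above could be written for any number of dimensions roughly as follows. This is only a sketch of the approach the question wants to avoid; the function name reduce_sum_unravel is hypothetical, and it assumes a contiguous row-major array with at least one dimension, axis < shape.size(), and data.size() equal to the product of shape.

```cpp
#include <cstddef>
#include <vector>

// Sum-reduce `data` (row-major, described by `shape`) along `axis` by
// unpacking every flat index into per-axis counters and re-packing it
// without the reduced axis. Hypothetical helper, for illustration only.
std::vector<double> reduce_sum_unravel(const std::vector<double>& data,
                                       const std::vector<std::size_t>& shape,
                                       std::size_t axis)
{
  const std::size_t ndim = shape.size();  // assumed >= 1

  // Row-major strides of the input: strides[k] = product of shape[k+1..].
  std::vector<std::size_t> strides(ndim, 1);
  for (std::size_t k = ndim - 1; k > 0; --k)
    strides[k - 1] = strides[k] * shape[k];

  // Row-major strides of the output; the reduced axis keeps stride 0,
  // so its counter contributes nothing to the output index.
  std::vector<std::size_t> rstrides(ndim, 0);
  std::size_t rsize = 1;
  for (std::size_t k = ndim; k-- > 0; ) {
    if (k == axis) continue;
    rstrides[k] = rsize;
    rsize *= shape[k];
  }

  std::vector<double> rdata(rsize, 0.0);
  for (std::size_t i = 0; i < data.size(); ++i) {
    std::size_t rest = i, ri = 0;
    for (std::size_t k = 0; k < ndim; ++k) {
      const std::size_t counter = rest / strides[k];  // i -> {a,b,c,d}
      rest %= strides[k];
      ri += counter * rstrides[k];                    // discard `axis`, re-pack
    }
    rdata[ri] += data[i];
  }
  return rdata;
}
```

The cost is one division and one modulo per axis for every element, which is presumably why the question asks for something cheaper.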

2 Answers:

Answer 0 (score: 3)

I don't know how efficient this code is, but I believe it is correct.

What is happening?

A bit about adjusted_strides:

For axis_count = 4, adjusted_strides has size 5, where:

 adjusted_strides[0] = shape[0]*shape[1]*shape[2]*shape[3];
 adjusted_strides[1] = shape[1]*shape[2]*shape[3];
 adjusted_strides[2] = shape[2]*shape[3];
 adjusted_strides[3] = shape[3];
 adjusted_strides[4] = 1;
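
For an arbitrary number of axes, the same table can be built from the shape alone. The helper below is a minimal sketch under the assumption of a contiguous row-major layout; the name make_adjusted_strides is not part of the answer.

```cpp
#include <cstddef>
#include <vector>

// adjusted[k] = shape[k] * shape[k+1] * ... * shape[n-1], and adjusted[n] = 1.
// Hypothetical helper; assumes a contiguous row-major layout.
std::vector<std::size_t> make_adjusted_strides(const std::vector<std::size_t>& shape)
{
  std::vector<std::size_t> adjusted(shape.size() + 1, 1);
  for (std::size_t k = shape.size(); k-- > 0; )
    adjusted[k] = adjusted[k + 1] * shape[k];
  return adjusted;
}
```

For the shape {2, 3, 4, 5} used in the question this gives {120, 60, 20, 5, 1}.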

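Let's take an example with 4 dimensions, where the multidimensional array (A) has shape n0, n1, n2, n3.

When we need to convert this array into another multidimensional array B of shape n0, n2, n3 (compressing axis = 1 (0-based)), we proceed as follows:

For each index of A, we try to find its position in B. Let A[i][j][k][l] be any element of A. Its position in flat_A is flat_A[i*n1*n2*n3 + j*n2*n3 + k*n3 + l]:

idx = i*n1*n2*n3 + j*n2*n3 + k*n3 + l;

In the compressed array B, this element becomes part of (is added to) B[i][k][l]. In flat_B, its index is new_idx = i*n2*n3 + k*n3 + l;

How do we form new_idx from idx?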
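
As a small illustration of the two formulas, they can be written as functions. The names flat_index_A and flat_index_B are made up for this sketch and do not appear in the answer's code.

```cpp
#include <cstddef>

// Flat position of A[i][j][k][l] in flat_A for shape (n0, n1, n2, n3).
// n0 does not appear in the formula.
std::size_t flat_index_A(std::size_t i, std::size_t j, std::size_t k, std::size_t l,
                         std::size_t n1, std::size_t n2, std::size_t n3)
{
  return i*n1*n2*n3 + j*n2*n3 + k*n3 + l;
}

// Flat position of B[i][k][l] in flat_B for shape (n0, n2, n3).
std::size_t flat_index_B(std::size_t i, std::size_t k, std::size_t l,
                         std::size_t n2, std::size_t n3)
{
  return i*n2*n3 + k*n3 + l;
}
```

With the question's shape (2, 3, 4, 5), A[1][2][3][4] sits at flat position 119 and is added to B[1][3][4] at flat position 39.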

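  1. All axes before the compressed axis have the shape of the compressed axis as part of their product. In our example we have to remove axis 1, so every axis before axis 1 (here only one: axis 0, represented by i) has n1 as part of its product (i*n1*n2*n3).

  2. All axes after the compressed axis are unaffected.

  3. Finally, we need to do two things:

    1. Isolate the index of the axes before the axis to be compressed, and remove that axis's shape:

      Integer division: idx / (n1*n2*n3) (== idx / adjusted_strides[1]).

      We are left with i, which can be rescaled according to the new shape (multiplied by n2*n3): we get

      i*n2*n3 (== i * adjusted_strides[2]).

    2. Isolate the axes after the compressed axis, which are not affected by its shape:

      idx % (n2*n3) (== idx % adjusted_strides[2])

      gives us k*n3 + l.

    3. Adding the results of steps 1 and 2 gives:

      computed_idx = i*n2*n3 + k*n3 + l;

      which is the same as new_idx. So our transformation is correct :).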

  4. Code:

    Note: ni refers to new_idx

      size_t cmp_axis = 1, axis_count = sizeof shape/ sizeof *shape;
      std::vector<size_t> adjusted_strides;
      //adjusted strides is basically same as strides
      //only difference being that the first element is the 
      //total number of elements in the n dim array.
    
      //The only reason to introduce this array was
      //so that I don't have to write any if-elses
      adjusted_strides.push_back(shape[0]*strides[0]);
      adjusted_strides.insert(adjusted_strides.end(), strides, strides + axis_count);
      for(size_t i = 0; i < data.size(); ++i) {
        size_t ni = i/adjusted_strides[cmp_axis]*adjusted_strides[cmp_axis+1] + i%adjusted_strides[cmp_axis+1];
        rdata[ni] += data[i];
      }
    

    Output (axis = 1)

    (0,0,0) 3
    (0,0,1) 3
    (0,0,2) 3
    (0,0,3) 3
    (0,0,4) 3
    (0,1,0) 3
    (0,1,1) 3
    (0,1,2) 3
    (0,1,3) 3
    (0,1,4) 3
    (0,2,0) 3
    (0,2,1) 3
    (0,2,2) 3
    (0,2,3) 3
    (0,2,4) 3
    (0,3,0) 3
    (0,3,1) 3
    (0,3,2) 3
    ...
    

    Tested here

    For further reading, see this
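
The division/modulo mapping above can be wrapped in a function that takes the shape and the axis as arguments. The following is a sketch under the same assumptions as the answer (contiguous row-major data, axis < shape.size(), a non-zero extent along the reduced axis); the name reduce_sum_divmod and the local names block and inner are made up here and correspond to adjusted_strides[axis] and adjusted_strides[axis + 1].

```cpp
#include <cstddef>
#include <vector>

// Sum-reduce a row-major array along `axis` using one division and one
// modulo per element. Hypothetical wrapper around the mapping described above.
std::vector<double> reduce_sum_divmod(const std::vector<double>& data,
                                      const std::vector<std::size_t>& shape,
                                      std::size_t axis)
{
  // inner = product of the extents after `axis`  (adjusted_strides[axis + 1])
  // block = shape[axis] * inner                  (adjusted_strides[axis])
  std::size_t inner = 1;
  for (std::size_t k = axis + 1; k < shape.size(); ++k) inner *= shape[k];
  const std::size_t block = shape[axis] * inner;

  std::vector<double> rdata(data.size() / shape[axis], 0.0);
  for (std::size_t i = 0; i < data.size(); ++i) {
    // i / block keeps the counters before `axis`; i % inner keeps those after it.
    const std::size_t ni = i / block * inner + i % inner;
    rdata[ni] += data[i];
  }
  return rdata;
}
```

Calling reduce_sum_divmod(data, {2, 3, 4, 5}, 1) on the all-ones array from the question should give 40 elements, each equal to 3.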

Answer 1 (score: 1)

I think this should work:

#include <iostream>
#include <vector>

int main()
{
  // shape, stride & data of the matrix
  size_t shape  [] = {  2, 3, 4, 5};
  size_t strides[] = {60, 20, 5, 1};
  std::vector<double> data(2 * 3 * 4 * 5);

  size_t rshape  [] = { 2, 4, 5};
  size_t rstrides[] = {20, 5, 1};
  std::vector<double> rdata(2 * 4 * 5, 0.0);

  const unsigned int NDIM = 4;
  unsigned int axis = 1;

  for (size_t i = 0 ; i < data.size() ; ++i) data[i] = 1;

  // How many elements to advance after each reduction
  size_t step_axis = strides[NDIM - 1];
  if (axis == NDIM - 1)
  {
      step_axis = strides[NDIM - 2];
  }
  // Position of the first element of the current reduction
  size_t offset_base = 0;
  size_t offset = 0;
  size_t s = 0;
  for (auto &v : rdata)
  {
      // Current reduced element
      size_t offset_i = offset;
      for (unsigned int i = 0; i < shape[axis]; i++)
      {
          // Reduce
          v += *(data.data() + offset_i);
          // Advance to next element
          offset_i += strides[axis];
      }
      s = (s + 1) % strides[axis];
      if (s == 0)
      {
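          // Jump to the next block of the outer axis; axis 0 has no outer
          // axis, and strides[axis - 1] would read out of bounds there
          if (axis > 0) offset_base += strides[axis - 1];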
          offset = offset_base;
      }
      else
      {
          offset += step_axis;
      }
  }

  // Print
  for ( size_t a = 0 ; a < rshape[0] ; ++a )
    for ( size_t b = 0 ; b < rshape[1] ; ++b )
      for ( size_t c = 0 ; c < rshape[2] ; ++c )
        std::cout << "(" << a << "," << b << "," << c << ") " << \
        rdata[ a*rstrides[0] + b*rstrides[1] + c*rstrides[2] ] << std::endl;

  return 0;
}

Output:

(0,0,0) 3
(0,0,1) 3
(0,0,2) 3
(0,0,3) 3
(0,0,4) 3
(0,1,0) 3
(0,1,1) 3
(0,1,2) 3
(0,1,3) 3
(0,1,4) 3
(0,2,0) 3
(0,2,1) 3
(0,2,2) 3
// ...

Setting axis = 3 yields:

(0,0,0) 5
(0,0,1) 5
(0,0,2) 5
(0,0,3) 5
(0,0,4) 5
(0,1,0) 5
(0,1,1) 5
(0,1,2) 5
(0,1,3) 5
(0,1,4) 5
(0,2,0) 5
(0,2,1) 5
(0,2,2) 5
(0,2,3) 5
// ...
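
A related way to organize the same offset arithmetic, not the answerer's code, is to view the array as outer x shape[axis] x inner, where outer is the product of the extents before the axis and inner is the product of those after it. That gives three loops regardless of the number of dimensions. The sketch below assumes contiguous row-major data and axis < shape.size(); the name reduce_sum_blocks is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sum-reduce a row-major array along `axis` by treating it as a 3-D array
// of shape (outer, shape[axis], inner). Hypothetical helper for illustration.
std::vector<double> reduce_sum_blocks(const std::vector<double>& data,
                                      const std::vector<std::size_t>& shape,
                                      std::size_t axis)
{
  std::size_t outer = 1, inner = 1;
  for (std::size_t k = 0; k < axis; ++k) outer *= shape[k];
  for (std::size_t k = axis + 1; k < shape.size(); ++k) inner *= shape[k];
  const std::size_t n = shape[axis];

  std::vector<double> rdata(outer * inner, 0.0);
  for (std::size_t o = 0; o < outer; ++o) {
    double* dst = rdata.data() + o * inner;  // start of the reduced slice
    for (std::size_t j = 0; j < n; ++j) {
      // Contiguous run of `inner` elements for this (o, j) pair.
      const double* src = data.data() + (o * n + j) * inner;
      for (std::size_t in = 0; in < inner; ++in) dst[in] += src[in];
    }
  }
  return rdata;
}
```

The loop depth stays fixed at three, so neither the axis nor the number of dimensions is hard-coded, and the innermost loop walks contiguous memory without any division or modulo.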