Question

Imagine I have an array of objects like this:

class Segment {
public:
    float x1, x2, y1, y2;
};

Segment* SegmentList[100];

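Based on this array of Segments, I would like to quickly extract their attributes and build arrays containing all the x1, x2, y1 and y2 values, like this: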

float x1List[100];

for (int i = 0; i < 100; i++) {
    x1List[i] = SegmentList[i]->x1;
}

I was wondering whether there is a faster way to read all the "x1" attributes into an array.

Update 1:

Since I will be loading this array into AVX registers, I could rephrase my question as:

"Is there a faster way to load the attributes of an array of objects into an AVX register?"

Answer 1

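I'd like to offer a different perspective. My contribution is really just an expansion of one of @PeterCordes's comments under the original post, but it is too long to be posted as a reply to that comment.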

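I recently optimized an old simulation of mine and ran into a problem similar to yours. I was simulating the motion of particles, and my program was built around a Particle structure. Loading its fields into AVX registers looked roughly like this: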

#include <array>
#include <immintrin.h>

void make_one_step(std::array<Particle, 4>& p, double time_step)
{
  __m256d pos_x = _mm256_set_pd(p[3].x, p[2].x, p[1].x, p[0].x);
  // do similar for y and z component of position and for all velocities

  // compute stuff (bottleneck)
  // __m256d new_pos_x = ...

  // do this for every component of velocity and position
  alignas(32) double vals[4]; // _mm256_store_pd requires a 32-byte aligned address
  _mm256_store_pd(vals, new_pos_x);
  for(int i = 0; i < 4; ++i) p[i].x = vals[i];
}

void simulate_movement(std::array<Particle, 4>& p, double time_step)
{
  for( ... lots of steps ...)
  {
    make_one_step(p, time_step); // bottleneck
    // check values of some components and do some cheap operations
  }
}

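I knew that the simulations of the individual particles were independent of each other, so I decided to use SIMD parallelism to make the simulation faster. As long as I kept relying on the Particle structure, I had to load every component of velocity and position into AVX registers as in the code above. For simplicity I am pretending that my array contains only four particles, but in fact I am dealing with a heap-allocated array of thousands of particles. The alternative is to keep the four values of each component together, in a structure like this: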

struct Quadruple
{
  alignas(32) double pos_x[4]; // 32-byte aligned so that _mm256_load_pd can be used
  // and similar for other position/velocity components
};
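For the real case of thousands of particles, one possible way to apply this layout is to keep a vector of such blocks, with particle i stored in block i / 4, lane i % 4. The following is only a sketch under assumptions: the Particle fields and the helper pack_pos_x are hypothetical (the post does not show them), only the x position is handled, and the last block is padded with zeros when the particle count is not a multiple of four.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle layout; the original post does not show its fields.
struct Particle {
    double x, y, z;
};

// Four x positions stored contiguously, aligned for a 32-byte AVX load.
// Note: std::vector only guarantees this over-alignment from C++17 on.
struct Quadruple {
    alignas(32) double pos_x[4];
};

// Hypothetical helper: particle i goes to block i / 4, lane i % 4.
// Lanes past the end of the input stay zero-initialized (padding).
std::vector<Quadruple> pack_pos_x(const std::vector<Particle>& particles) {
    const std::size_t blocks = (particles.size() + 3) / 4;  // round up
    std::vector<Quadruple> packed(blocks);                  // value-initialized to zero
    for (std::size_t i = 0; i < particles.size(); ++i) {
        packed[i / 4].pos_x[i % 4] = particles[i].x;
    }
    assert(packed.size() * 4 >= particles.size());
    return packed;
}
```

Each element of the returned vector can then be passed to a per-block step function that loads pos_x with _mm256_load_pd. Whether the padding lanes need special treatment depends on the computation.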

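To be honest, I had so much to compute in the simulation (some relatively advanced physics) that loading and storing were not the bottleneck at all. However, the ugliness of repacking at every step of the procedure gave me extra motivation to fix the problem. I did not change the input of the algorithm (I still use Particle objects), but inside the algorithm itself I repack the data from four particles and store it in the Quadruple structure shown above, so that the stepping code becomes: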

void make_one_step(Quadruple& p, double time_step)
{
  __m256d pos_x = _mm256_load_pd(p.pos_x); // for every component

  // compute stuff in same way as before

  _mm256_store_pd(p.pos_x, new_pos_x); // for every component
}

void simulate_movement(std::array<Particle, 4> &particles, double time_step)
{
   //Quadruple q = ... // store data in Quadruple

   for( ... a lot of time steps ... )
   {
     make_one_step(q, time_step); //bottleneck
    // check values of some components and do some cheap operations
   }

  // get data from quadruple and store it in an array of particles
}

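After these changes, the simulation looks like the code above. Long story short, I only modified the layer between the algorithm and its interface. Efficient loading? Check. Input data unchanged? Check.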
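The two steps that are only hinted at in simulate_movement (filling the Quadruple before the loop and writing the results back afterwards) could look roughly like the sketch below. This is an illustration, not the original simulation code: the Particle fields and the helpers pack and unpack are assumptions, and only the x components of position and velocity are shown.

```cpp
#include <array>
#include <cstddef>

// Hypothetical particle layout; only the x components are shown here.
struct Particle {
    double x;
    double vx;
};

// Same idea as the Quadruple above, with one velocity component added.
struct Quadruple {
    alignas(32) double pos_x[4];
    alignas(32) double vel_x[4];
};

// Gather the fields of four particles into per-component arrays.
// Intended to run once, before the time-step loop.
Quadruple pack(const std::array<Particle, 4>& particles) {
    Quadruple q{};
    for (std::size_t i = 0; i < 4; ++i) {
        q.pos_x[i] = particles[i].x;
        q.vel_x[i] = particles[i].vx;
    }
    return q;
}

// Scatter the per-component arrays back into the particle objects.
// Intended to run once, after the time-step loop.
void unpack(const Quadruple& q, std::array<Particle, 4>& particles) {
    for (std::size_t i = 0; i < 4; ++i) {
        particles[i].x = q.pos_x[i];
        particles[i].vx = q.vel_x[i];
    }
}
```

With helpers like these, the scalar repacking cost is paid twice per simulation instead of twice per time step, and make_one_step only ever sees aligned arrays.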


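Unfortunately, I can't tell whether this will help you; it depends on what you are doing. If you need all the data in an array before you start computing, my advice won't help you, and it is equally useless if repacking the data is itself a bottleneck. :) Good luck.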

How to extract an array of attributes from an array of objects?
