Splitting an array made up of interleaved arrays

Asked: 2017-10-13 15:26:25

Tags: c# .net arrays performance

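I have a sensor that sends me a one-dimensional float array, which I have to split into 4 different sub-arrays. The array represents a frame made up of 1024 ramps. Each ramp has a header followed by the data for 4 channels (the data is what I want to split out). For each point, each channel has 2 floats, one for the real part and one for the imaginary part. To clarify, I have attached an image of the structure: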

[Image: diagram of the frame structure]
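The layout described above can also be written down as index arithmetic. The following is only a sketch: the `FrameLayout` class and `RealIndex` method are hypothetical names, and it assumes that the header size is a whole number of floats and that, within a ramp, the channels are interleaved point by point (the order the copy loop in this question walks through).

```csharp
static class FrameLayout
{
    // Hypothetical helper: index (counted in floats) of the real part of one
    // sample inside the raw frame. The imaginary part sits at the next index.
    // Assumption: headerFloats is the header size expressed in floats.
    public static int RealIndex(int ramp, int point, int channel,
                                int nPoints, int nChannels, int headerFloats)
    {
        // One ramp = header + nPoints * nChannels (real, imaginary) pairs
        int rampFloats = headerFloats + nPoints * nChannels * 2;

        return ramp * rampFloats                    // skip the preceding ramps
             + headerFloats                         // skip this ramp's header
             + (point * nChannels + channel) * 2;   // skip the preceding pairs
    }
}
```

For example, with 2 points, 2 channels and a one-float header, each ramp occupies 9 floats, so the first sample of ramp 1 starts at index 10.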

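I need to unpack this big array into 4 arrays containing only the data, one per channel. This has to be done quickly. My implementation takes about 850 ms, which unfortunately is not fast enough. So far I have written the following code: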

IntPtr ptr = (IntPtr)frameInfo.ptr; // The pointer to the buffer

for (int i = 0; i < nChannels; i++)
{
    channelFrames[i].data = new float[nRamps * nPoints * 2];
}

for (int ramp = 0; ramp < nRamps; ramp++)
{
    ptr += (int)rawHeaderSize; // Skip the header

    for (int point = 0; point < nPoints; point++)
    {
        for (int channel = 0; channel < nChannels; channel++)
        {
            // Copy one (real, imaginary) pair into this channel's array
            Marshal.Copy(ptr, channelFrames[channel].data, (int)(point * 2 + ramp * nPoints * 2), 2);

            ptr += (sizeof(float) * 2); // Move to the next data
        }
    }
}

Any ideas on how to do this faster?

1 answer:

Answer 0 (score: 1)

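Marshal.Copy() is probably the performance bottleneck, because it calls into unmanaged code, and that call is too expensive for copying just 2 floats. The code below uses unsafe code (which must be enabled in the project properties, and the method must be marked with the unsafe modifier) to avoid Marshal.Copy() and copy the data manually. The inner loop (the one iterating over the channels) is also unrolled for an additional performance gain (the downside is that the code is hard-coded for 4 channels).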

My measurements show an almost 10x performance improvement over the original method.

//Pin arrays with channel data in memory and get pointers of these fixed arrays
fixed (float* fixed_ch0ptr = channelFrames[0].data)
fixed (float* fixed_ch1ptr = channelFrames[1].data)
fixed (float* fixed_ch2ptr = channelFrames[2].data)
fixed (float* fixed_ch3ptr = channelFrames[3].data)
{
    //pointers declared in a fixed statement cannot be modified, so we must create writable copies of them
    float* ch0ptr = fixed_ch0ptr;
    float* ch1ptr = fixed_ch1ptr;
    float* ch2ptr = fixed_ch2ptr;
    float* ch3ptr = fixed_ch3ptr;

    //unsafe pointer to array from sensor
    float* floatptr = (float*)ptr;

    for (int ramp = 0; ramp < nRamps; ramp++)
    {
        floatptr = (float*)((byte*)(floatptr) + (int)rawHeaderSize); // Skip the header

        for (int point = 0; point < nPoints; point++)
        {
            //Unrolling the loop that iterates over channelFrames can give us some additional performance gains

            //copy channel 0 data
            *ch0ptr++ = *(floatptr++);
            *ch0ptr++ = *(floatptr++);

            //copy channel 1 data
            *ch1ptr++ = *(floatptr++);
            *ch1ptr++ = *(floatptr++);

            //copy channel 2 data
            *ch2ptr++ = *(floatptr++);
            *ch2ptr++ = *(floatptr++);

            //copy channel 3 data
            *ch3ptr++ = *(floatptr++);
            *ch3ptr++ = *(floatptr++);
        }
    }
}
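
If unsafe code is not an option, a different technique (not the one shown above, and not measured here) is to call Marshal.Copy() once for the whole frame and then de-interleave in managed code, so the cost of the call is paid once instead of once per pair of floats. The sketch below assumes the header size is a multiple of sizeof(float); `FrameSplitter`, `Split` and the `headerBytes` parameter are hypothetical names, and it returns plain float arrays rather than filling channelFrames.

```csharp
using System;
using System.Runtime.InteropServices;

static class FrameSplitter
{
    // Hypothetical helper: splits one raw frame into one array per channel.
    // Assumption: headerBytes is a multiple of sizeof(float).
    public static float[][] Split(IntPtr rawFrame, int nRamps, int nPoints,
                                  int nChannels, int headerBytes)
    {
        if (headerBytes % sizeof(float) != 0)
            throw new ArgumentException("Header size must be a multiple of sizeof(float).", nameof(headerBytes));

        int headerFloats = headerBytes / sizeof(float);
        int rampFloats = headerFloats + nPoints * nChannels * 2;

        // One bulk copy of the whole frame into managed memory
        float[] raw = new float[nRamps * rampFloats];
        Marshal.Copy(rawFrame, raw, 0, raw.Length);

        float[][] channels = new float[nChannels][];
        for (int c = 0; c < nChannels; c++)
            channels[c] = new float[nRamps * nPoints * 2];

        int dst = 0; // write position, identical for every channel array
        for (int ramp = 0; ramp < nRamps; ramp++)
        {
            int src = ramp * rampFloats + headerFloats; // skip the header

            for (int point = 0; point < nPoints; point++)
            {
                for (int c = 0; c < nChannels; c++)
                {
                    channels[c][dst] = raw[src++];     // real part
                    channels[c][dst + 1] = raw[src++]; // imaginary part
                }
                dst += 2;
            }
        }
        return channels;
    }
}
```

The trade-off is one extra managed copy of the frame. Whether this comes close to the unsafe version would have to be measured on the real data.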