Copying an array to an array of structs as fast as possible in C#

Time: 2014-08-14 15:01:42

Tags: c# arrays unity3d marshalling

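I am working with Unity 4.5. I receive an image as a byte array (each byte is one channel, and each pixel takes 4 bytes (rgba)) and display it on a texture by converting the array to a Color32 array with this loop: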

img = new Color32[byteArray.Length / nChannels]; // nChannels being 4
for (int i = 0; i < img.Length; i++) {
    img[i].r = byteArray[i*nChannels];
    img[i].g = byteArray[i*nChannels+1];
    img[i].b = byteArray[i*nChannels+2];
    img[i].a = byteArray[i*nChannels+3];
}

Then it is applied to the texture using:

tex.SetPixels32(img);

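However, this slows the application down significantly (the loop runs on every frame), and I would like to know whether there is another way to speed up the copy. I have found people (Fast copy of Color32[] array to byte[] array) using the Marshal.Copy function for the reverse process (Color32 to byte array), but I have not managed to make it work for copying a byte array into a Color32 array. Does anybody know a faster way?

Thanks in advance!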

5 answers:

Answer 0 (score: 12)

Yes, Marshal.Copy is the way to go. I have answered a similar question here.

Here is a generic way to copy from struct[] to byte[] and vice versa:

private static byte[] ToByteArray<T>(T[] source) where T : struct
{
    GCHandle handle = GCHandle.Alloc(source, GCHandleType.Pinned);
    try
    {
        IntPtr pointer = handle.AddrOfPinnedObject();
        byte[] destination = new byte[source.Length * Marshal.SizeOf(typeof(T))];
        Marshal.Copy(pointer, destination, 0, destination.Length);
        return destination;
    }
    finally
    {
        if (handle.IsAllocated)
            handle.Free();
    }
}

private static T[] FromByteArray<T>(byte[] source) where T : struct
{
    T[] destination = new T[source.Length / Marshal.SizeOf(typeof(T))];
    GCHandle handle = GCHandle.Alloc(destination, GCHandleType.Pinned);
    try
    {
        IntPtr pointer = handle.AddrOfPinnedObject();
        Marshal.Copy(source, 0, pointer, source.Length);
        return destination;
    }
    finally
    {
        if (handle.IsAllocated)
            handle.Free();
    }
}

Use it as:

[StructLayout(LayoutKind.Sequential)]
public struct Demo
{
    public double X;
    public double Y;
}

private static void Main()
{
    Demo[] array = new Demo[2];
    array[0] = new Demo { X = 5.6, Y = 6.6 };
    array[1] = new Demo { X = 7.6, Y = 8.6 };

    byte[] bytes = ToByteArray(array);
    Demo[] array2 = FromByteArray<Demo>(bytes);
}
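
Because the question runs the conversion on every frame, one possible variation on the generic method above is to copy into a Color32 buffer that is allocated once and reused, so that no new array is created per frame. The following is only a minimal sketch under stated assumptions: the Color32 struct is a stand-in for UnityEngine.Color32 (assumed to be four sequential bytes r, g, b, a), and PixelCopy.CopyInto is a hypothetical helper name, not part of Unity or .NET.

```csharp
using System;
using System.Runtime.InteropServices;

// Stand-in for UnityEngine.Color32 (assumption: four sequential bytes r, g, b, a).
[StructLayout(LayoutKind.Sequential)]
public struct Color32
{
    public byte r, g, b, a;
}

public static class PixelCopy
{
    // Hypothetical helper: fills an existing Color32 buffer from raw RGBA bytes.
    public static void CopyInto(byte[] source, Color32[] destination)
    {
        int byteCount = destination.Length * Marshal.SizeOf(typeof(Color32));
        if (source.Length < byteCount)
            throw new ArgumentException("source is too short for destination", "source");

        // Pin the destination so the GC cannot move it while its address is in use.
        GCHandle handle = GCHandle.Alloc(destination, GCHandleType.Pinned);
        try
        {
            Marshal.Copy(source, 0, handle.AddrOfPinnedObject(), byteCount);
        }
        finally
        {
            handle.Free();
        }
    }
}
```

The caller would allocate the Color32 array once (for example when the texture size is known) and call CopyInto each frame before tex.SetPixels32(img).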

Answer 1 (score: 8)

This code requires the unsafe switch, but it should be fast. I think you should benchmark these answers...

var bytes = new byte[] { 1, 2, 3, 4 };

var colors = MemCopyUtils.ByteArrayToColor32Array(bytes);

public class MemCopyUtils
{
    unsafe delegate void MemCpyDelegate(byte* dst, byte* src, int len);
    static MemCpyDelegate MemCpy;

    static MemCopyUtils()
    {
        InitMemCpy();
    }

    static void InitMemCpy()
    {
        var mi = typeof(Buffer).GetMethod(
            name: "Memcpy",
            bindingAttr: BindingFlags.NonPublic | BindingFlags.Static,
            binder:  null,
            types: new Type[] { typeof(byte*), typeof(byte*), typeof(int) },
            modifiers: null);
        MemCpy = (MemCpyDelegate)Delegate.CreateDelegate(typeof(MemCpyDelegate), mi);
    }

    public unsafe static Color32[] ByteArrayToColor32Array(byte[] bytes)
    {
        Color32[] colors = new Color32[bytes.Length / sizeof(Color32)];

        fixed (void* tempC = &colors[0])
        fixed (byte* pBytes = bytes)
        {
            byte* pColors = (byte*)tempC;
            MemCpy(pColors, pBytes, bytes.Length);
        }
        return colors;
    }
}
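
On newer runtimes the same block copy can be written without reflection or the unsafe switch. The sketch below assumes Span<T> and MemoryMarshal are available (.NET Core 2.1 or later, or the System.Memory package), which is not the case for the Mono runtime that shipped with Unity 4.5, so treat it as an alternative for other environments rather than a drop-in fix for the question. The Color32 struct is again a stand-in for the Unity type, and SpanCopy is a hypothetical class name.

```csharp
using System;
using System.Runtime.InteropServices;

// Stand-in for UnityEngine.Color32 (assumption: four sequential bytes r, g, b, a).
[StructLayout(LayoutKind.Sequential)]
public struct Color32
{
    public byte r, g, b, a;
}

public static class SpanCopy
{
    public static Color32[] ByteArrayToColor32Array(byte[] bytes)
    {
        ReadOnlySpan<byte> source = bytes;

        // Reinterpret the bytes as Color32 values without copying.
        // Cast rounds the length down, so a trailing partial pixel is ignored.
        ReadOnlySpan<Color32> view = MemoryMarshal.Cast<byte, Color32>(source);

        // ToArray performs a single block copy into a new array.
        return view.ToArray();
    }
}
```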

Answer 2 (score: 1)

Using Parallel.For may give a significant performance boost.

img = new Color32[byteArray.Length / nChannels]; //nChannels being 4
Parallel.For(0, img.Length, i =>
{
    img[i].r = byteArray[i*nChannels];
    img[i].g = byteArray[i*nChannels+1];
    img[i].b = byteArray[i*nChannels+2];
    img[i].a = byteArray[i*nChannels+3];
});

Example on MSDN
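
A per-element Parallel.For invokes a delegate for every pixel, which can eat into the gain when the loop body is this small. One common variation is to hand each worker a contiguous range of indices instead. The sketch below assumes .NET 4.0 or later (System.Threading.Tasks and System.Collections.Concurrent), uses a stand-in Color32 struct, and ParallelCopy is a hypothetical class name; whether it actually beats the sequential loop has to be measured.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Stand-in for UnityEngine.Color32 (assumption: four byte fields r, g, b, a).
public struct Color32
{
    public byte r, g, b, a;
}

public static class ParallelCopy
{
    private const int Channels = 4; // rgba, as in the question

    public static Color32[] Convert(byte[] byteArray)
    {
        var img = new Color32[byteArray.Length / Channels];
        if (img.Length == 0)
            return img; // Partitioner.Create rejects an empty range

        // Each task receives a contiguous [Item1, Item2) block of pixel indices.
        Parallel.ForEach(Partitioner.Create(0, img.Length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                int offset = i * Channels;
                img[i].r = byteArray[offset];
                img[i].g = byteArray[offset + 1];
                img[i].b = byteArray[offset + 2];
                img[i].a = byteArray[offset + 3];
            }
        });
        return img;
    }
}
```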

Answer 3 (score: 1)

public Color32[] GetColorArray(byte[] myByte)
{
    if (myByte.Length % nChannels != 0)
       throw new Exception("Length must be a multiple of nChannels");

    var colors = new Color32[myByte.Length / nChannels];

    for (var i = 0; i < myByte.Length; i += nChannels)
    {
       colors[i / nChannels] = new Color32(
           (byte)(myByte[i] & 0xF8),
           (byte)(((myByte[i] & 7) << 5) | ((myByte[i + 1] & 0xE0) >> 3)),
           (byte)((myByte[i + 1] & 0x1F) << 3),
           (byte)1);
    }

    return colors;
}

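This works about 30-50 times faster than the plain i++ loop. The "extras" are just styling. This code does, in a single "line" inside the for loop, what you declare in 4 lines, and it is quicker as well. Cheers :)

Reference + reference code: Here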

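Answer 4 (score: 1)

I haven't profiled it, but using fixed to make sure your memory is not moved around, and to remove the bounds checks on array accesses, might provide some benefit: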

img = new Color32[byteArray.Length / nChannels]; //nChannels being 4
fixed (byte* ba = byteArray)
{
    fixed (Color32* c = img)
    {
        byte* byteArrayPtr = ba;
        Color32* colorPtr = c;
        for (int i = 0; i < img.Length; i++)
        {
            (*colorPtr).r = *byteArrayPtr++;
            (*colorPtr).g = *byteArrayPtr++;
            (*colorPtr).b = *byteArrayPtr++;
            (*colorPtr).a = *byteArrayPtr++;
            colorPtr++;
        }
    }
}

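It may not provide much more benefit on 64-bit systems; I believe bounds checking is more heavily optimized there. Also, this is an unsafe operation, so be careful.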
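
Since several answers in this thread are untested guesses about speed, a small timing harness can help compare them on the actual target. This is only a rough sketch: it uses Stopwatch with one warm-up call, a stand-in Color32 struct, and hypothetical names (CopyBenchmark, LoopCopy, MarshalCopy, AverageMilliseconds). Results will vary with runtime, platform, and image size, so no numbers are claimed here.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Stand-in for UnityEngine.Color32 (assumption: four sequential bytes r, g, b, a).
[StructLayout(LayoutKind.Sequential)]
public struct Color32
{
    public byte r, g, b, a;
}

public static class CopyBenchmark
{
    // The element-by-element loop from the question.
    public static void LoopCopy(byte[] src, Color32[] dst)
    {
        for (int i = 0; i < dst.Length; i++)
        {
            int offset = i * 4;
            dst[i].r = src[offset];
            dst[i].g = src[offset + 1];
            dst[i].b = src[offset + 2];
            dst[i].a = src[offset + 3];
        }
    }

    // The pinned block copy from answer 0, specialised to Color32.
    public static void MarshalCopy(byte[] src, Color32[] dst)
    {
        GCHandle handle = GCHandle.Alloc(dst, GCHandleType.Pinned);
        try
        {
            Marshal.Copy(src, 0, handle.AddrOfPinnedObject(), dst.Length * 4);
        }
        finally
        {
            handle.Free();
        }
    }

    // Runs the action once as a warm-up, then returns the mean time per call.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        action();
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

A call such as CopyBenchmark.AverageMilliseconds(() => CopyBenchmark.LoopCopy(bytes, colors), 100) can then be compared against the same call with MarshalCopy, using a byte array the size of a real frame.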