在包含固定数组的托管不安全结构上,C#fixed语句的开销是多少?

时间:2011-12-13 22:18:35

标签: c# clr fixed unsafe

我一直在尝试确定在C#中使用fixed语句对于包含固定数组的托管不安全结构的真实成本。请注意,我不是指非托管结构。

具体来说,有没有理由避免下面的'MultipleFixed'类显示的模式?简单地修复数据的成本是非零,接近零(= =成本类似于在进入/退出固定范围时设置和清除单个标志),还是足够重要以避免在可能的情况下?

显然,这些课程是为了帮助解释这个问题而设计的。这是针对XNA游戏中的高使用率数据结构,其中此数据的读/写性能至关重要,因此如果我需要修复数组并将其传递到任何地方,我会这样做,但如果没有任何差别我我更喜欢将fixed()本地保留在方法中,以帮助保持函数签名对于不支持不安全代码的平台更加轻松。 (是的,它的一些额外的咕噜声代码,但不管它需要什么......)

 

    unsafe struct ByteArray
    {
       public fixed byte Data[1024];
    }

    class MultipleFixed
    {
       unsafe void SetValue(ref ByteArray bytes, int index, byte value)
       {
           fixed(byte* data = bytes.Data)
           {
               data[index] = value;
           }
       }

        unsafe bool Validate(ref ByteArray bytes, int index, byte expectedValue)
        {
           fixed(byte* data = bytes.Data)
           {
               return data[index] == expectedValue;
           }
        }

        void Test(ref ByteArray bytes)
        {
            SetValue(ref bytes, 0, 1);
            Validate(ref bytes, 0, 1);
        }
    }

    class SingleFixed
    {
       unsafe void SetValue(byte* data, int index, byte value)
       {
           data[index] = value;
       }

        unsafe bool Validate(byte* data, int index, byte expectedValue)
        {
           return data[index] == expectedValue;
        }

        unsafe void Test(ref ByteArray bytes)
        {
            fixed(byte* data = bytes.Data)
            {
                SetValue(data, 0, 1);
                Validate(data, 0, 1);
            }
        }
    }

此外,我查找了类似的问题,我找到的最接近的问题是this,但这个问题的不同之处在于它只涉及纯托管代码以及在该上下文中使用fixed的具体成本。

感谢您的任何信息!

2 个答案:

答案 0 :(得分:11)

这实际上是我自己有趣的问题。

我设法获得的结果表明性能损失的原因略有不同而不是“固定”。声明本身。

你可以看到我运行的测试和下面的结果,但我从中得出以下观察结果:

  • 使用'固定'的表现使用纯指针(x *),没有IntPtr,就像在托管代码中一样好;在发布模式下,如果不经常使用固定,它甚至可以更好 - 这是访问多个数组值的最佳方式
  • 在调试模式下,使用' fixed' (在循环内部)具有很大的负面性能影响但在发布模式下,它的工作方式几乎与普通数组访问一样好(方法FixedAccess);
  • 使用' ref'在引用类型参数值(float [])上始终具有更高或同等性能(两种模式)
  • 使用IntPtr算法(IntPtrAccess)时,调试模式与释放模式相比具有显着的性能下降,但对于两种模式,性能都比正常的阵列访问更差
  • 如果使用偏移未与数组的值对齐'偏移,性能很糟糕,无论模式如何(两种模式实际需要相同的时间)。这适用于' float'但对' int'
  • 没有影响

多次运行测试,结果略有不同但大致一致。可能我应该运行许多系列的测试并花费平均时间 - 但没有时间:)

测试类首先:

class Test {
    public static void NormalAccess (float[] array, int index) {
        array[index] = array[index] + 2;
    }

    public static void NormalRefAccess (ref float[] array, int index) {
        array[index] = array[index] + 2;
    }

    public static void IntPtrAccess (IntPtr arrayPtr, int index) {
        unsafe {
            var array = (float*) IntPtr.Add (arrayPtr, index << 2);
            (*array) = (*array) + 2;
        }
    }

    public static void IntPtrMisalignedAccess (IntPtr arrayPtr, int index) {
        unsafe {
            var array = (float*) IntPtr.Add (arrayPtr, index); // getting bits of a float
            (*array) = (*array) + 2;
        }
    }

    public static void FixedAccess (float[] array, int index) {
        unsafe {
            fixed (float* ptr = &array[index]) 
                (*ptr) = (*ptr) + 2;
        }
    }

    public unsafe static void PtrAccess (float* ptr) {
        (*ptr) = (*ptr) + 2;
    }

}

测试本身:

    static int runs = 1000*1000*100;
    public static void Print (string name, Stopwatch sw) {
        Console.WriteLine ("{0}, items/sec = {1:N} \t {2}", sw.Elapsed, (runs / sw.ElapsedMilliseconds) * 1000, name);
    }

    static void Main (string[] args) {
        var buffer = new float[1024*1024*100];
        var len = buffer.Length;

        var sw = new Stopwatch();
        for (int i = 0; i < 1000; i++) { 
            Test.FixedAccess (buffer, 55);
            Test.NormalAccess (buffer, 66);
        }

        Console.WriteLine ("Starting {0:N0} items", runs);


        sw.Restart ();
        for (int i = 0; i < runs; i++) 
            Test.NormalAccess (buffer, i % len);
        sw.Stop ();

        Print ("Normal access", sw);

        sw.Restart ();
        for (int i = 0; i < runs; i++) 
            Test.NormalRefAccess (ref buffer, i % len);
        sw.Stop ();

        Print ("Normal Ref access", sw);

        sw.Restart ();
        unsafe {
            fixed (float* ptr = &buffer[0])
                for (int i = 0; i < runs; i++) { 
                    Test.IntPtrAccess ((IntPtr) ptr, i % len);
                }
        }
        sw.Stop ();

        Print ("IntPtr access (fixed outside loop)", sw);

        sw.Restart ();
        unsafe {
            fixed (float* ptr = &buffer[0])
                for (int i = 0; i < runs; i++) { 
                    Test.IntPtrMisalignedAccess ((IntPtr) ptr, i % len);
                }
        }
        sw.Stop ();

        Print ("IntPtr Misaligned access (fixed outside loop)", sw);

        sw.Restart ();
        for (int i = 0; i < runs; i++) 
            Test.FixedAccess (buffer, i % len);
        sw.Stop ();

        Print ("Fixed access (fixed inside loop)", sw);

        sw.Restart ();
        unsafe {
            fixed (float* ptr = &buffer[0]) { 
                for (int i = 0; i < runs; i++) { 
                    Test.PtrAccess (ptr + (i % len));
                }
            }
        }
        sw.Stop ();

        Print ("float* access (fixed outside loop)", sw);

        sw.Restart ();
        unsafe {
            for (int i = 0; i < runs; i++) { 
                fixed (float* ptr = &buffer[i % len]) { 
                    Test.PtrAccess (ptr);
                }
            }
        }
        sw.Stop ();

        Print ("float* access (fixed in loop)", sw);

最后结果:

调试模式

Starting 100,000,000 items
00:00:01.0373583, items/sec = 96,432,000.00      Normal access
00:00:00.8582307, items/sec = 116,550,000.00     Normal Ref access
00:00:01.8822085, items/sec = 53,134,000.00      IntPtr access (fixed outside loop)
00:00:10.5356369, items/sec = 9,492,000.00       IntPtr Misaligned access (fixed outside loop)
00:00:01.6860701, items/sec = 59,311,000.00      Fixed access (fixed inside loop)
00:00:00.7577868, items/sec = 132,100,000.00     float* access (fixed outside loop)
00:00:01.0387792, items/sec = 96,339,000.00      float* access (fixed in loop)

发布模式

Starting 100,000,000 items
00:00:00.7454832, items/sec = 134,228,000.00     Normal access
00:00:00.6619090, items/sec = 151,285,000.00     Normal Ref access
00:00:00.9859089, items/sec = 101,522,000.00     IntPtr access (fixed outside loop)
00:00:10.1289018, items/sec = 9,873,000.00       IntPtr Misaligned access (fixed outside loop)
00:00:00.7899355, items/sec = 126,742,000.00     Fixed access (fixed inside loop)
00:00:00.5718507, items/sec = 175,131,000.00     float* access (fixed outside loop)
00:00:00.6842333, items/sec = 146,198,000.00     float* access (fixed in loop)

答案 1 :(得分:8)

根据经验,在最佳情况下,开销似乎是32位JIT上的~270%和64位上的~200%(并且“调用”fixed的次数越多,开销越大)。因此,如果性能非常关键,我会尽量减少fixed块。

抱歉,我对固定/不安全代码不太熟悉,知道原因是什么


<强>详情

我还添加了一些TestMore方法,它们将您的两种测试方法调用10次而不是2次 给出一个更真实的场景,在fixed结构上调用多个方法。

我使用的代码:

class Program
{
    static void Main(string[] args)
    {
        var someData = new ByteArray();
        int iterations = 1000000000;
        var multiple = new MultipleFixed();
        var single = new SingleFixed();

        // Warmup.
        for (int i = 0; i < 100; i++)
        {
            multiple.Test(ref someData);
            single.Test(ref someData);
            multiple.TestMore(ref someData);
            single.TestMore(ref someData);
        }

        // Environment.
        if (Debugger.IsAttached)
            Console.WriteLine("Debugger is attached!!!!!!!!!! This run is invalid!");
        Console.WriteLine("CLR Version: " + Environment.Version);
        Console.WriteLine("Pointer size: {0} bytes", IntPtr.Size);
        Console.WriteLine("Iterations: " + iterations);

        Console.Write("Starting run for Single... ");
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            single.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.Write("Starting run for More Single... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            single.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);


        Console.Write("Starting run for Multiple... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            multiple.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.Write("Starting run for More Multiple... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            multiple.TestMore(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);


        Console.ReadLine();
    }
}

unsafe struct ByteArray
{
    public fixed byte Data[1024];
}

class MultipleFixed
{
    unsafe void SetValue(ref ByteArray bytes, int index, byte value)
    {
        fixed (byte* data = bytes.Data)
        {
            data[index] = value;
        }
    }

    unsafe bool Validate(ref ByteArray bytes, int index, byte expectedValue)
    {
        fixed (byte* data = bytes.Data)
        {
            return data[index] == expectedValue;
        }
    }

    public void Test(ref ByteArray bytes)
    {
        SetValue(ref bytes, 0, 1);
        Validate(ref bytes, 0, 1);
    }
    public void TestMore(ref ByteArray bytes)
    {
        SetValue(ref bytes, 0, 1);
        Validate(ref bytes, 0, 1);
        SetValue(ref bytes, 0, 2);
        Validate(ref bytes, 0, 2);
        SetValue(ref bytes, 0, 3);
        Validate(ref bytes, 0, 3);
        SetValue(ref bytes, 0, 4);
        Validate(ref bytes, 0, 4);
        SetValue(ref bytes, 0, 5);
        Validate(ref bytes, 0, 5);
    }
}

class SingleFixed
{
    unsafe void SetValue(byte* data, int index, byte value)
    {
        data[index] = value;
    }

    unsafe bool Validate(byte* data, int index, byte expectedValue)
    {
        return data[index] == expectedValue;
    }

    public unsafe void Test(ref ByteArray bytes)
    {
        fixed (byte* data = bytes.Data)
        {
            SetValue(data, 0, 1);
            Validate(data, 0, 1);
        }
    }
    public unsafe void TestMore(ref ByteArray bytes)
    {
        fixed (byte* data = bytes.Data)
        {
            SetValue(data, 0, 1);
            Validate(data, 0, 1);
            SetValue(data, 0, 2);
            Validate(data, 0, 2);
            SetValue(data, 0, 3);
            Validate(data, 0, 3);
            SetValue(data, 0, 4);
            Validate(data, 0, 4);
            SetValue(data, 0, 5);
            Validate(data, 0, 5);
        }
    }
}

.NET 4.0中的结果,32位JIT:

CLR Version: 4.0.30319.239
Pointer size: 4 bytes
Iterations: 1000000000
Starting run for Single... Completed in 2,092.350ms - 477,931,580.94/sec
Starting run for More Single... Completed in 2,236.767ms - 447,073,934.63/sec
Starting run for Multiple... Completed in 5,775.922ms - 173,132,528.92/sec
Starting run for More Multiple... Completed in 26,637.862ms - 37,540,550.36/sec

在.NET 4.0中,64位JIT:

CLR Version: 4.0.30319.239
Pointer size: 8 bytes
Iterations: 1000000000
Starting run for Single... Completed in 2,907.946ms - 343,885,316.72/sec
Starting run for More Single... Completed in 2,904.903ms - 344,245,585.63/sec
Starting run for Multiple... Completed in 5,754.893ms - 173,765,185.93/sec
Starting run for More Multiple... Completed in 18,679.593ms - 53,534,358.13/sec