Creating a constant but local array

Date: 2016-08-23 16:11:41

Tags: c# .net arrays performance optimization

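Sometimes I need a hard-coded lookup table for a single method.

I can create such an array either

  • locally, inside the method itself
  • as a static member of the class

Example for the first case: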

public int Convert(int i)
{
    int[] lookup = new[] {1, 2, 4, 8, 16, 32, 666, /*...*/ };
    return lookup[i];
}

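As far as I understand, the .NET runtime creates a new lookup array every time this method executes. Is that correct, or is the JITer smart enough to cache and reuse the array between calls?

I presume the answer is no, so if I want to make sure the array is cached between calls, one way would be to make it static.

Example for the second case: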

private static readonly int[] lookup = new[] { 1, 2, 4, 8, 16, 32, 666, /*...*/ };
public int Convert(int i)
{
    return lookup[i];
}

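Is there a way to do this without polluting my class's namespace? Can I somehow declare a static array that is visible only in the current scope?

1 answer:

Answer 0 (score: 44)

Local array

The Roslyn compiler puts local arrays in the metadata. Let's take the first version of your Convert method: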

public int Convert(int i)
{
    int[] lookup = new[] {1, 2, 4, 8, 16, 32, 666, /*...*/ };
    return lookup[i];
}

Here is the corresponding IL code (Release build, Roslyn 1.3.1.60616):

// Token: 0x06000002 RID: 2 RVA: 0x0000206C File Offset: 0x0000026C
.method public hidebysig 
    instance int32 Convert (
        int32 i
    ) cil managed noinlining 
{
    // Header Size: 1 byte
    // Code Size: 20 (0x14) bytes
    .maxstack 8

    /* 0x0000026D 1D           */ IL_0000: ldc.i4.7
    /* 0x0000026E 8D13000001   */ IL_0001: newarr    [mscorlib]System.Int32
    /* 0x00000273 25           */ IL_0006: dup
    /* 0x00000274 D001000004   */ IL_0007: ldtoken   field valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=28' '<PrivateImplementationDetails>'::'502D7419C3650DEE94B5938147BC9B4724D37F99'
    /* 0x00000279 281000000A   */ IL_000C: call      void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
    /* 0x0000027E 03           */ IL_0011: ldarg.1
    /* 0x0000027F 94           */ IL_0012: ldelem.i4
    /* 0x00000280 2A           */ IL_0013: ret
} // end of method Program::Convert

And here is PrivateImplementationDetails:

// Token: 0x02000003 RID: 3
.class private auto ansi sealed '<PrivateImplementationDetails>'
    extends [mscorlib]System.Object
{
    .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
        01 00 00 00
    )
    // Nested Types
    // Token: 0x02000004 RID: 4
    .class nested private explicit ansi sealed '__StaticArrayInitTypeSize=28'
        extends [mscorlib]System.ValueType
    {
        .pack 1
        .size 28

    } // end of class __StaticArrayInitTypeSize=28


    // Fields
    // Token: 0x04000001 RID: 1 RVA: 0x00002944 File Offset: 0x00000B44
    .field assembly static initonly valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=28' '502D7419C3650DEE94B5938147BC9B4724D37F99' at I_00002944 // 28 (0x001c) bytes

} // end of class <PrivateImplementationDetails>

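As you can see, the contents of your lookup array are stored in the assembly metadata. When the method runs, the JIT-compiled code only has to allocate a new array and copy the contents from the metadata. An asm example (Windows 10, .NET Framework 4.6.1 (4.0.30319.42000), RyuJIT: clrjit-v4.6.1080.0, Release build):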

            int[] lookup = new[] { 1, 2, 4, 8, 16, 32, 666, /*...*/ };
00007FFEDF0A44E2  sub         esp,20h  
00007FFEDF0A44E5  mov         esi,edx  
00007FFEDF0A44E7  mov         rcx,7FFF3D1C4C62h  
00007FFEDF0A44F1  mov         edx,7  
00007FFEDF0A44F6  call        00007FFF3E6B2600  
00007FFEDF0A44FB  mov         rdx,134CF7F2944h  
00007FFEDF0A4505  mov         ecx,dword ptr [rax+8]  
00007FFEDF0A4508  lea         r8,[rax+10h]  
00007FFEDF0A450C  vmovdqu     xmm0,xmmword ptr [rdx]  
00007FFEDF0A4511  vmovdqu     xmmword ptr [r8],xmm0  
00007FFEDF0A4516  mov         r9,qword ptr [rdx+10h]  
00007FFEDF0A451A  mov         qword ptr [r8+10h],r9  
00007FFEDF0A451E  mov         r9d,dword ptr [rdx+18h]  
00007FFEDF0A4522  mov         dword ptr [r8+18h],r9d  
            return lookup[i];
00007FFEDF0A4526  cmp         esi,ecx  
            return lookup[i];
00007FFEDF0A4528  jae         00007FFEDF0A4537  
00007FFEDF0A452A  movsxd      rdx,esi  
00007FFEDF0A452D  mov         eax,dword ptr [rax+rdx*4+10h]  
00007FFEDF0A4531  add         rsp,20h  
00007FFEDF0A4535  pop         rsi  
00007FFEDF0A4536  ret  
00007FFEDF0A4537  call        00007FFF3EB57BE0  
00007FFEDF0A453C  int         3  

LegacyJIT-x64 version:

            int[] lookup = new[] { 1, 2, 4, 8, 16, 32, 666, /*...*/ };
00007FFEDF0E41E0  push        rbx  
00007FFEDF0E41E1  push        rdi  
00007FFEDF0E41E2  sub         rsp,28h  
00007FFEDF0E41E6  mov         ebx,edx  
00007FFEDF0E41E8  mov         edx,7  
00007FFEDF0E41ED  lea         rcx,[7FFF3D1C4C62h]  
00007FFEDF0E41F4  call        00007FFF3E6B2600  
00007FFEDF0E41F9  mov         rdi,rax  
00007FFEDF0E41FC  lea         rcx,[7FFEDF124760h]  
00007FFEDF0E4203  call        00007FFF3E73CA90  
00007FFEDF0E4208  mov         rdx,rax  
00007FFEDF0E420B  mov         rcx,rdi  
00007FFEDF0E420E  call        00007FFF3E73C8B0  
            return lookup[i];
00007FFEDF0E4213  movsxd      r11,ebx  
00007FFEDF0E4216  mov         rax,qword ptr [rdi+8]  
00007FFEDF0E421A  cmp         r11,7  
00007FFEDF0E421E  jae         00007FFEDF0E4230  
00007FFEDF0E4220  mov         eax,dword ptr [rdi+r11*4+10h]  
00007FFEDF0E4225  add         rsp,28h  
00007FFEDF0E4229  pop         rdi  
00007FFEDF0E422A  pop         rbx  
00007FFEDF0E422B  ret  
00007FFEDF0E422C  nop         dword ptr [rax]  
00007FFEDF0E4230  call        00007FFF3EB57BE0  
00007FFEDF0E4235  nop  

LegacyJIT-x86 version:

            int[] lookup = new[] { 1, 2, 4, 8, 16, 32, 666, /*...*/ };
009A2DC4  push        esi  
009A2DC5  push        ebx  
009A2DC6  mov         ebx,edx  
009A2DC8  mov         ecx,6A2C402Eh  
009A2DCD  mov         edx,7  
009A2DD2  call        0094322C  
009A2DD7  lea         edi,[eax+8]  
009A2DDA  mov         esi,5082944h  
009A2DDF  mov         ecx,7  
009A2DE4  rep movs    dword ptr es:[edi],dword ptr [esi]  
            return lookup[i];
009A2DE6  cmp         ebx,dword ptr [eax+4]  
009A2DE9  jae         009A2DF4  
009A2DEB  mov         eax,dword ptr [eax+ebx*4+8]  
009A2DEF  pop         ebx  
009A2DF0  pop         esi  
009A2DF1  pop         edi  
009A2DF2  pop         ebp  
009A2DF3  ret  
009A2DF4  call        6B9D52F0  
009A2DF9  int         3  
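
All three listings above allocate a fresh array and then copy the seven values into it. A minimal sketch for observing that from C# follows; the class LocalArrayDemo and the method CreateLookup are hypothetical names introduced only for this illustration, and the method returns the array so that two calls can be compared.

```csharp
using System;

public static class LocalArrayDemo
{
    // Hypothetical helper: the same array literal as in Convert,
    // returned to the caller so that instances can be compared.
    public static int[] CreateLookup()
    {
        return new[] { 1, 2, 4, 8, 16, 32, 666 };
    }

    public static void Main()
    {
        int[] first = CreateLookup();
        int[] second = CreateLookup();

        // Each call produced its own array object.
        Console.WriteLine(ReferenceEquals(first, second)); // False

        // Changing one copy leaves the other untouched.
        first[0] = 42;
        Console.WriteLine(second[0]); // 1
    }
}
```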

Static array

Now, let's compare it with the second version:

private static readonly int[] lookup = new[] { 1, 2, 4, 8, 16, 32, 666, /*...*/ };

public int Convert(int i)
{            
    return lookup[i];
}

IL:

// Token: 0x04000001 RID: 1
.field private static initonly int32[] lookup

// Token: 0x06000002 RID: 2 RVA: 0x00002056 File Offset: 0x00000256
.method public hidebysig 
    instance int32 Convert (
        int32 i
    ) cil managed noinlining 
{
    // Header Size: 1 byte
    // Code Size: 8 (0x8) bytes
    .maxstack 8

    /* 0x00000257 7E01000004   */ IL_0000: ldsfld    int32[] ConsoleApplication5.Program::lookup
    /* 0x0000025C 03           */ IL_0005: ldarg.1
    /* 0x0000025D 94           */ IL_0006: ldelem.i4
    /* 0x0000025E 2A           */ IL_0007: ret
} // end of method Program::Convert

// Token: 0x02000003 RID: 3
.class private auto ansi sealed '<PrivateImplementationDetails>'
    extends [mscorlib]System.Object
{
    .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
        01 00 00 00
    )
    // Nested Types
    // Token: 0x02000004 RID: 4
    .class nested private explicit ansi sealed '__StaticArrayInitTypeSize=28'
        extends [mscorlib]System.ValueType
    {
        .pack 1
        .size 28

    } // end of class __StaticArrayInitTypeSize=28


    // Fields
    // Token: 0x04000002 RID: 2 RVA: 0x000028FC File Offset: 0x00000AFC
    .field assembly static initonly valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=28' '502D7419C3650DEE94B5938147BC9B4724D37F99' at I_000028fc // 28 (0x001c) bytes

} // end of class <PrivateImplementationDetails>

ASM (RyuJIT-x64):

            return lookup[i];
00007FFEDF0B4490  sub         rsp,28h  
00007FFEDF0B4494  mov         rax,212E52E0080h  
00007FFEDF0B449E  mov         rax,qword ptr [rax]  
00007FFEDF0B44A1  mov         ecx,dword ptr [rax+8]  
00007FFEDF0B44A4  cmp         edx,ecx  
00007FFEDF0B44A6  jae         00007FFEDF0B44B4  
00007FFEDF0B44A8  movsxd      rdx,edx  
00007FFEDF0B44AB  mov         eax,dword ptr [rax+rdx*4+10h]  
00007FFEDF0B44AF  add         rsp,28h  
00007FFEDF0B44B3  ret  
00007FFEDF0B44B4  call        00007FFF3EB57BE0  
00007FFEDF0B44B9  int         3  

ASM (LegacyJIT-x64):

            return lookup[i];
00007FFEDF0A4611  sub         esp,28h  
00007FFEDF0A4614  mov         rcx,226CC5203F0h  
00007FFEDF0A461E  mov         rcx,qword ptr [rcx]  
00007FFEDF0A4621  movsxd      r8,edx  
00007FFEDF0A4624  mov         rax,qword ptr [rcx+8]  
00007FFEDF0A4628  cmp         r8,rax  
00007FFEDF0A462B  jae         00007FFEDF0A4637  
00007FFEDF0A462D  mov         eax,dword ptr [rcx+r8*4+10h]  
00007FFEDF0A4632  add         rsp,28h  
00007FFEDF0A4636  ret  
00007FFEDF0A4637  call        00007FFF3EB57BE0  
00007FFEDF0A463C  nop  

ASM (LegacyJIT-x86):

            return lookup[i];
00AA2E18  push        ebp  
00AA2E19  mov         ebp,esp  
00AA2E1B  mov         eax,dword ptr ds:[03628854h]  
00AA2E20  cmp         edx,dword ptr [eax+4]  
00AA2E23  jae         00AA2E2B  
00AA2E25  mov         eax,dword ptr [eax+edx*4+8]  
00AA2E29  pop         ebp  
00AA2E2A  ret  
00AA2E2B  call        6B9D52F0  
00AA2E30  int         3  
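
In the static version, Convert only loads the field and indexes into it; the array itself is built once, when the type is initialized. The sketch below makes that visible by counting how often the initializer runs. StaticLookup, CreateLookup and InitCount are hypothetical names for this illustration, and routing the initializer through a method is an assumption made only so that it can be counted.

```csharp
using System;

public class StaticLookup
{
    // Hypothetical counter, present only to show how often the array is built.
    public static int InitCount;

    // The field initializer runs once, as part of type initialization.
    private static readonly int[] lookup = CreateLookup();

    private static int[] CreateLookup()
    {
        InitCount++;
        return new[] { 1, 2, 4, 8, 16, 32, 666 };
    }

    public int Convert(int i)
    {
        return lookup[i];
    }
}

public static class StaticLookupDemo
{
    public static void Main()
    {
        var converter = new StaticLookup();
        int sum = 0;
        for (int i = 0; i < 3; i++)
            sum += converter.Convert(3);

        Console.WriteLine(sum);                    // 24
        Console.WriteLine(StaticLookup.InitCount); // 1
    }
}
```

Note that readonly protects only the reference: the elements of a static array can still be overwritten by any code that can reach the field.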

Benchmark

Let's write a benchmark with the help of BenchmarkDotNet:

[Config(typeof(Config)), LegacyJitX86Job, LegacyJitX64Job, RyuJitX64Job, RPlotExporter]
public class ArrayBenchmarks
{
    private static readonly int[] lookup = new[] {1, 2, 4, 8, 16, 32, 666, /*...*/};

    [MethodImpl(MethodImplOptions.NoInlining)]
    public int ConvertStatic(int i)
    {
        return lookup[i];
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public int ConvertLocal(int i)
    {
        int[] localLookup = new[] {1, 2, 4, 8, 16, 32, 666, /*...*/};
        return localLookup[i];
    }

    [Benchmark]
    public int Static()
    {
        int sum = 0;
        for (int i = 0; i < 10001; i++)
            sum += ConvertStatic(0);
        return sum;
    }

    [Benchmark]
    public int Local()
    {
        int sum = 0;
        for (int i = 0; i < 10001; i++)
            sum += ConvertLocal(0);
        return sum;
    }

    private class Config : ManualConfig
    {
        public Config()
        {
            Add(new MemoryDiagnoser());                
            Add(MarkdownExporter.StackOverflow);
        }
    }
}

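Note that this is a synthetic toy benchmark that uses NoInlining on the Convert methods. We use it only to show the difference between the two approaches. Real performance depends on how you use the Convert method in your code. My results: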

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4702MQ CPU 2.20GHz, ProcessorCount=8
Frequency=2143474 ticks, Resolution=466.5324 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1586.0

Type=ArrayBenchmarks  Mode=Throughput  

 Method | Platform |       Jit |        Median |     StdDev |    Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
------- |--------- |---------- |-------------- |----------- |--------- |------ |------ |------------------- |
 Static |      X64 | LegacyJit |    24.0243 us |  0.1590 us |        - |     - |     - |               1.07 |
  Local |      X64 | LegacyJit | 2,068.1034 us | 33.7142 us | 1,089.00 |     - |     - |         436,603.02 |
 Static |      X64 |    RyuJit |    20.7906 us |  0.2018 us |        - |     - |     - |               1.06 |
  Local |      X64 |    RyuJit |    83.4041 us |  0.9993 us |   613.55 |     - |     - |         244,936.53 |
 Static |      X86 | LegacyJit |    20.9957 us |  0.2267 us |        - |     - |     - |               1.01 |
  Local |      X86 | LegacyJit |   167.6257 us |  1.3543 us |   431.43 |     - |     - |         172,121.77 |
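
To read the table as slowdown factors, divide each Local median by the Static median for the same JIT. The sketch below does only that arithmetic on the numbers reported above; SlowdownRatios and Ratio are hypothetical names, and nothing here re-measures anything.

```csharp
using System;
using System.Globalization;

public static class SlowdownRatios
{
    // Ratio of the local-array median to the static-array median.
    public static double Ratio(double localUs, double staticUs)
    {
        return localUs / staticUs;
    }

    public static void Main()
    {
        // Median times in microseconds, copied from the results table.
        string[] jits = { "LegacyJit x64", "RyuJit x64", "LegacyJit x86" };
        double[] staticUs = { 24.0243, 20.7906, 20.9957 };
        double[] localUs = { 2068.1034, 83.4041, 167.6257 };

        for (int i = 0; i < jits.Length; i++)
        {
            Console.WriteLine(string.Format(
                CultureInfo.InvariantCulture,
                "{0}: {1:F1}x",
                jits[i],
                Ratio(localUs[i], staticUs[i])));
        }
        // LegacyJit x64: 86.1x
        // RyuJit x64: 4.0x
        // LegacyJit x86: 8.0x
    }
}
```

These factors apply to this toy loop on this machine only; they say nothing about code where the lookup is a small part of the work.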

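Conclusion

  • Does .NET cache hard-coded local arrays? Sort of: the Roslyn compiler puts the array contents in the metadata.
  • Is there any overhead in this case? Unfortunately, yes: the JIT-compiled code copies the array contents from the metadata on every call, which takes longer than reading a static array. The runtime also allocates a new object each time, which produces memory traffic.
  • Should we care about it? It depends. If it's a hot method and you want good performance, use a static array. If it's a cold method that doesn't affect application performance, you should probably write "good" source code and keep the array in the method scope.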
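
For the hot-method case, one possible way to keep the table static while keeping it out of the class's list of fields is a private nested static class. This is only a sketch and not something the benchmark above measured; Converter and ConvertTable are hypothetical names. C# has no method-scoped static variables, so the nested type still adds one name to the enclosing class, but that name groups the table with the method it serves.

```csharp
using System;

public class Converter
{
    // Hypothetical holder type: the table lives here instead of
    // appearing as a loose field on Converter itself.
    private static class ConvertTable
    {
        internal static readonly int[] Values = { 1, 2, 4, 8, 16, 32, 666 };
    }

    public int Convert(int i)
    {
        // Reads the shared array; nothing is allocated per call.
        return ConvertTable.Values[i];
    }
}

public static class ConverterDemo
{
    public static void Main()
    {
        var converter = new Converter();
        Console.WriteLine(converter.Convert(3)); // 8
        Console.WriteLine(converter.Convert(6)); // 666
    }
}
```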