Question

Is this a valid way to do performance analysis? I want to get nanosecond accuracy and determine the performance of typecasting:

class PerformanceTest
{
    static double last = 0.0;
    static List<object> numericGenericData = new List<object>();
    static List<double> numericTypedData = new List<double>();

    static void Main(string[] args)
    {
        double totalWithCasting = 0.0;
        double totalWithoutCasting = 0.0;
        for (double d = 0.0; d < 1000000.0; ++d)
        {
            numericGenericData.Add(d);
            numericTypedData.Add(d);
        }
        Stopwatch stopwatch = new Stopwatch();
        for (int i = 0; i < 10; ++i)
        {

            stopwatch.Start();
            testWithTypecasting();
            stopwatch.Stop();
            totalWithCasting += stopwatch.ElapsedTicks;

            stopwatch.Start();
            testWithoutTypeCasting();
            stopwatch.Stop();
            totalWithoutCasting += stopwatch.ElapsedTicks;
        }

        Console.WriteLine("Avg with typecasting = {0}", (totalWithCasting/10));
        Console.WriteLine("Avg without typecasting = {0}", (totalWithoutCasting/10));
        Console.ReadKey();
    }

    static void testWithTypecasting()
    {
        foreach (object o in numericGenericData)
        {
            last = ((double)o*(double)o)/200;
        }
    }

    static void testWithoutTypeCasting()
    {
        foreach (double d in numericTypedData)
        {
            last = (d * d)/200;
        }
    }
}

Output:

Avg with typecasting = 468872.3
Avg without typecasting = 501157.9

I'm a little suspicious... it looks like there is almost no performance impact. Is casting really that cheap?

Update:

class PerformanceTest
{
    static double last = 0.0;
    static object[] numericGenericData = new object[100000];
    static double[] numericTypedData = new double[100000];

    static Stopwatch stopwatch = new Stopwatch();
    static double totalWithCasting = 0.0;
    static double totalWithoutCasting = 0.0;
    static void Main(string[] args)
    {
        for (int i = 0; i < 100000; ++i)
        {
            numericGenericData[i] = (double)i;
            numericTypedData[i] = (double)i;
        }

        for (int i = 0; i < 10; ++i)
        {
            stopwatch.Start();
            testWithTypecasting();
            stopwatch.Stop();
            totalWithCasting += stopwatch.ElapsedTicks;
            stopwatch.Reset();

            stopwatch.Start();
            testWithoutTypeCasting();
            stopwatch.Stop();
            totalWithoutCasting += stopwatch.ElapsedTicks;
            stopwatch.Reset();
        }

        Console.WriteLine("Avg with typecasting = {0}", (totalWithCasting/(10.0)));
        Console.WriteLine("Avg without typecasting = {0}", (totalWithoutCasting / (10.0)));
        Console.ReadKey();
    }

    static void testWithTypecasting()
    {
        foreach (object o in numericGenericData)
        {
            last = ((double)o * (double)o) / 200;
        }
    }

    static void testWithoutTypeCasting()
    {
        foreach (double d in numericTypedData)
        {
            last = (d * d) / 200;
        }
    }
}

Output:

Avg with typecasting = 4791
Avg without typecasting = 3303.9

Answer 1

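Note that what you are measuring is not typecasting but unboxing. The values are doubles all along; no type conversion takes place.

You forgot to reset the stopwatch between tests, so you are repeatedly adding the accumulated time of all previous tests. If you convert the ticks to actual time, you will see that the total is far more than the time it took to run the tests.

If you add stopwatch.Reset(); before each stopwatch.Start();, you get a much more reasonable result, such as: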

Avg with typecasting = 41027,1
Avg without typecasting = 20594,3
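
To make the accumulation visible, here is a minimal sketch (not the asker's code) that times the same short sleep twice on one Stopwatch, once without and once with Reset() in between. The class name StopwatchResetDemo, the TimeTwice helper and the 20 ms sleep are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class StopwatchResetDemo
{
    // Times the same 20 ms sleep twice on one Stopwatch and returns
    // both readings in milliseconds: { first, second }.
    public static double[] TimeTwice(bool resetBetween)
    {
        Stopwatch sw = new Stopwatch();

        sw.Start();
        Thread.Sleep(20);
        sw.Stop();
        double first = sw.Elapsed.TotalMilliseconds;

        if (resetBetween)
            sw.Reset();          // sets the elapsed time back to zero

        sw.Start();              // without Reset() this resumes from 'first'
        Thread.Sleep(20);
        sw.Stop();
        double second = sw.Elapsed.TotalMilliseconds;

        return new[] { first, second };
    }

    static void Main()
    {
        double[] accumulated = TimeTwice(false);
        double[] reset = TimeTwice(true);
        Console.WriteLine("No Reset:   first = {0:F1} ms, second = {1:F1} ms",
                          accumulated[0], accumulated[1]);
        Console.WriteLine("With Reset: first = {0:F1} ms, second = {1:F1} ms",
                          reset[0], reset[1]);
    }
}
```

Without the Reset() call the second reading includes the first one, which is the effect described above. Elapsed converts ticks to real time for you; dividing ElapsedTicks by Stopwatch.Frequency gives the same value in seconds.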

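Unboxing a value is not that expensive; it only has to check that the data type in the object is correct and then fetch the value. Still, that is a lot more work than when the type is already known. Remember that you are also measuring the looping, the calculation and the assignment of the result, which are the same for both tests.

Boxing a value is more expensive than unboxing it, because it allocates an object on the heap.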
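
The difference between unboxing and an actual conversion can be seen in a small sketch like the following. The names UnboxingDemo and CanUnboxAsInt are hypothetical helpers, not part of the question's code.

```csharp
using System;

static class UnboxingDemo
{
    // Returns true if the boxed value can be unboxed directly as an int.
    // Unboxing only succeeds when the boxed type matches exactly.
    public static bool CanUnboxAsInt(object boxed)
    {
        try
        {
            int i = (int)boxed;   // unboxing: type check, then copy the value out
            return true;
        }
        catch (InvalidCastException)
        {
            return false;         // the box held some other type
        }
    }

    static void Main()
    {
        object boxed = 1.5;                  // boxing: allocates a heap object holding the double
        double d = (double)boxed;            // unboxing: no conversion, the value was a double all along
        int converted = (int)(double)boxed;  // unbox first, then a real double-to-int conversion

        Console.WriteLine(d);
        Console.WriteLine(converted);
        Console.WriteLine(CanUnboxAsInt(boxed));  // a boxed double cannot be unboxed as int
    }
}
```

The (double)o in the question's loop is the unboxing form: it checks the type and copies the value, with no numeric conversion involved.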

Answer 2

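1) Yes, casting is usually (very) cheap.

2) You are not going to get nanosecond accuracy in a managed language. Or in an unmanaged language under most operating systems.

Consider:

other processes
garbage collection
different JITters
different CPUs

Also, your measurement includes the foreach loop, which looks like 50% or more of the total to me. Maybe 90%.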
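
As a rough check on what resolution the timer itself offers, a sketch like this prints the Stopwatch properties on the current machine. The class name TimerResolution and the NanosecondsPerTick helper are illustrative only, and the values differ between machines and runtimes.

```csharp
using System;
using System.Diagnostics;

static class TimerResolution
{
    // Length of one Stopwatch tick in nanoseconds on this machine.
    public static double NanosecondsPerTick()
    {
        return 1e9 / Stopwatch.Frequency;
    }

    static void Main()
    {
        Console.WriteLine("High-resolution timer: {0}", Stopwatch.IsHighResolution);
        Console.WriteLine("Ticks per second:      {0}", Stopwatch.Frequency);
        Console.WriteLine("Nanoseconds per tick:  {0}", NanosecondsPerTick());
    }
}
```

Even when a tick is short, the factors listed above (other processes, garbage collection, the JIT) still add noise well beyond one tick, so this number is only a lower bound on the achievable precision.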

Answer 3

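When you call Stopwatch.Start, it lets the timer continue running from wherever it left off. You need to call Stopwatch.Reset() to set the timer back to zero before starting it again. Personally, I just use stopwatch = Stopwatch.StartNew() whenever I want to start a timer, to avoid this kind of confusion.

Also, you probably want to call both test methods before starting the "timing loop", so that they get a fair chance to "warm up" that piece of code and the JIT has had a chance to run, to even the playing field.

When I do that on my machine, I see that testWithTypecasting runs in approximately half the time as testWithoutTypeCasting.

That being said, the cast itself is unlikely to be the most significant part of that performance penalty. The testWithTypecasting method operates on a list of boxed doubles, which means that, in addition to increasing the total amount of memory consumed, an extra level of indirection is required to retrieve each value (following a reference to the value somewhere else in memory). This increases the time spent on memory access and is likely a larger effect than the CPU time spent "in the cast" itself.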
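
The warm-up and StartNew advice above could be sketched roughly as follows. This is one possible arrangement rather than a definitive harness: WarmedUpTiming and AverageTicks are hypothetical names, and passing the test as an Action adds a small delegate-call overhead to every run.

```csharp
using System;
using System.Diagnostics;

static class WarmedUpTiming
{
    static double last;
    static readonly object[] boxed = new object[100000];
    static readonly double[] typed = new double[100000];

    // Fill both arrays once, before any timing happens.
    static WarmedUpTiming()
    {
        for (int i = 0; i < boxed.Length; i++)
        {
            boxed[i] = (double)i;   // boxes each double
            typed[i] = i;
        }
    }

    static void TestBoxed()
    {
        foreach (object o in boxed) last = ((double)o * (double)o) / 200;
    }

    static void TestTyped()
    {
        foreach (double d in typed) last = (d * d) / 200;
    }

    // Runs the test once untimed so the JIT has compiled it, then
    // averages the ticks of 'runs' timed calls, each on a fresh Stopwatch.
    public static double AverageTicks(Action test, int runs)
    {
        test();                                    // warm-up call, not measured
        long total = 0;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();   // starts from zero every time
            test();
            sw.Stop();
            total += sw.ElapsedTicks;
        }
        return (double)total / runs;
    }

    static void Main()
    {
        Console.WriteLine("Avg with unboxing    = {0}", AverageTicks(TestBoxed, 10));
        Console.WriteLine("Avg without unboxing = {0}", AverageTicks(TestTyped, 10));
    }
}
```

Because each run gets its own Stopwatch, there is nothing to reset and no time carries over between the two tests.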

Answer 4

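Take a look at the performance counters in the System.Diagnostics namespace. When you create a new counter, you first create a category and then specify one or more counters to be placed in it.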

// Create a collection of type CounterCreationDataCollection.
System.Diagnostics.CounterCreationDataCollection CounterDatas = 
   new System.Diagnostics.CounterCreationDataCollection();
// Create the counters and set their properties.
System.Diagnostics.CounterCreationData cdCounter1 = 
   new System.Diagnostics.CounterCreationData();
System.Diagnostics.CounterCreationData cdCounter2 = 
   new System.Diagnostics.CounterCreationData();
cdCounter1.CounterName = "Counter1";
cdCounter1.CounterHelp = "help string1";
cdCounter1.CounterType = System.Diagnostics.PerformanceCounterType.NumberOfItems64;
cdCounter2.CounterName = "Counter2";
cdCounter2.CounterHelp = "help string 2";
cdCounter2.CounterType = System.Diagnostics.PerformanceCounterType.NumberOfItems64;
// Add both counters to the collection.
CounterDatas.Add(cdCounter1);
CounterDatas.Add(cdCounter2);
// Create the category and pass the collection to it.
System.Diagnostics.PerformanceCounterCategory.Create(
   "Multi Counter Category", "Category help", CounterDatas);

See the MSDN docs.

Answer 5

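Just a thought, but sometimes identical machine code can take a different number of cycles to execute depending on its alignment in memory, so you might want to add a control or controls.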

Answer 6

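I don't "do" C# myself, but in C on x86-32 and later the rdtsc instruction is usually available, and it is much more accurate than OS ticks. More information on rdtsc can be found by searching stackoverflow. Under C it is usually available as an intrinsic or built-in function, and it returns the number of clock cycles since the computer was powered up (as an 8-byte unsigned integer: long long / __int64). So if the CPU has a clock speed of 3 GHz, the underlying counter is incremented 3 billion times per second. Except for a few early AMD processors, all multi-core CPUs keep their counters synchronized.

If C# does not have it, you might consider writing a very short C function to access it from C#. There is a great deal of overhead if you access the instruction through a function rather than inline. The difference between two back-to-back calls to the function will be the basic measurement overhead. If you are thinking of metering your application, you will have to determine several more complex overhead values.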
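
The back-to-back idea can be tried from C# without any native code. Note the substitution: the sketch below uses Stopwatch.GetTimestamp() as a stand-in, not rdtsc, so it reports the overhead of that call in Stopwatch ticks rather than CPU cycles. TimestampOverhead and MinBackToBackTicks are illustrative names.

```csharp
using System;
using System.Diagnostics;

static class TimestampOverhead
{
    // Smallest observed difference between two back-to-back timestamp
    // reads, in Stopwatch ticks. The minimum is used because larger
    // values mostly reflect interruptions, not the call itself.
    public static long MinBackToBackTicks(int samples)
    {
        long min = long.MaxValue;
        for (int i = 0; i < samples; i++)
        {
            long t0 = Stopwatch.GetTimestamp();
            long t1 = Stopwatch.GetTimestamp();
            long delta = t1 - t0;
            if (delta < min) min = delta;
        }
        return min;
    }

    static void Main()
    {
        long ticks = MinBackToBackTicks(100000);
        double ns = ticks * 1e9 / Stopwatch.Frequency;
        Console.WriteLine("Minimum back-to-back difference: {0} ticks (~{1} ns)", ticks, ns);
    }
}
```

Whatever this prints on a given machine is the floor below which a single measurement taken with this timer says very little.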

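You might consider shutting off the CPU energy-saving mode (and restarting the PC), as it lowers the clock frequency being fed to the CPU during periods of low activity. This matters because it causes the time stamp counters of the different cores to become unsynchronized.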

C# performance analysis - how to count CPU cycles?

6 answers: