Question

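I ran into a case where simple .NET Fibonacci code runs slower on one particular set of servers, and the only obvious difference is the CPU.

- AMD Opteron Processor 6276: 11 seconds
- Intel Xeon CPU E7-4850: 7 seconds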

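The code is compiled for x86 and uses .NET Framework 4.0. The two CPUs have similar speeds on paper, and the PassMark benchmark actually gives the AMD a higher score.

- I tried this on other AMD servers in the farm, and the times were slower there too.
- Even my local i7 machine runs the code faster.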

Fibonacci code:

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        const int ITERATIONS = 10000;
        const int FIBONACCI = 100000;

        var watch = new Stopwatch();
        watch.Start();


        DoFibonnacci(ITERATIONS, FIBONACCI);

        watch.Stop();

        Console.WriteLine("Total fibonacci time: {0}ms", watch.ElapsedMilliseconds);
        Console.ReadLine();
    }

    private static void DoFibonnacci(int ITERATIONS, int FIBONACCI)
    {
        for (int i = 0; i < ITERATIONS; i++)
        {
            Fibonacci(FIBONACCI);
        }
    }

    private static int Fibonacci(int x)
    {
        var previousValue = -1;
        var currentResult = 1;

        for (var i = 0; i <= x; ++i)
        {
            var sum = currentResult + previousValue;
            previousValue = currentResult;
            currentResult = sum;
        }

        return currentResult;
    }

}

Any ideas about what might be going on?

Answer 1

As we established in the comments, you can work around this performance issue by pinning the process to a specific processor on the AMD Opteron machines.

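Prompted by this not-really-on-topic question, I decided to look into what could make single-core pinning produce such a difference (going from 11 to 7 seconds seems a bit extreme).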

The most plausible answer is not that revolutionary:

The AMD Opteron series uses HyperTransport in a so-called NUMA architecture, rather than the traditional FSB found on Intel's SMP CPUs (the Xeon 4850 included).

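My guess is that this symptom stems from the fact that the individual nodes in a NUMA architecture have separate caches, whereas on the Intel CPU the processor cache is shared.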

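In other words, when consecutive computations get balanced across nodes on the Opteron, the cache is flushed, while balancing between processors in an SMP architecture such as the Xeon 4850 has no such impact, since the cache is shared.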

Setting the affinity in .NET is pretty easy: just pick a processor (for simplicity, let's use the first one):

static void Main(string[] args)
{
    Console.WriteLine(Environment.ProcessorCount);
    Console.Read();

    // An affinity mask of 0x0001 makes sure the process is always pinned to processor 0
    Process thisProcess = Process.GetCurrentProcess();
    thisProcess.ProcessorAffinity = (IntPtr)0x0001; 

    const int ITERATIONS = 10000;
    const int FIBONACCI = 100000;

    var watch = new Stopwatch();
    watch.Start();


    DoFibonnacci(ITERATIONS, FIBONACCI);

    watch.Stop();

    Console.WriteLine("Total fibonacci time: {0}ms", watch.ElapsedMilliseconds);
    Console.ReadLine();
}

That said, I'm pretty sure this is not a very smart thing to do in a NUMA environment.
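The affinity value used above is a bit field with one bit per logical processor, so pinning to a different core (or a small set of cores) only means setting different bits. A minimal sketch of that idea follows; the `AffinitySketch` class and `BuildMask` helper are hypothetical names made up for illustration, and the sketch assumes a machine with at most 64 logical processors in a single processor group.

```csharp
using System;
using System.Diagnostics;

static class AffinitySketch
{
    // Hypothetical helper: builds an affinity bitmask with one bit set
    // for each logical processor index passed in.
    public static long BuildMask(params int[] processorIndexes)
    {
        long mask = 0;
        foreach (int index in processorIndexes)
        {
            if (index < 0 || index >= Environment.ProcessorCount)
            {
                throw new ArgumentOutOfRangeException(
                    "processorIndexes", "No such logical processor: " + index);
            }
            mask |= 1L << index;
        }
        return mask;
    }

    static void Main()
    {
        Process thisProcess = Process.GetCurrentProcess();

        // Remember the original mask so it can be restored afterwards.
        IntPtr original = thisProcess.ProcessorAffinity;

        // Pin to processor 0, the same effect as (IntPtr)0x0001 above.
        // Note: in a 32-bit process, new IntPtr(long) throws for values
        // that do not fit in 32 bits, so very high indexes need extra care.
        thisProcess.ProcessorAffinity = new IntPtr(BuildMask(0));
        Console.WriteLine("Pinned mask: 0x{0:X}",
            thisProcess.ProcessorAffinity.ToInt64());

        // ... run the timed work here ...

        thisProcess.ProcessorAffinity = original;
    }
}
```

Restoring the original mask at the end keeps the pinning limited to the measured section, which may matter if the same process does other work afterwards.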

Windows 2008 R2 has some cool native NUMA functionality, and I found a promising CodePlex project with a .NET wrapper for it: http://multiproc.codeplex.com/

I'm nowhere near qualified to teach you how to use this technology, but this should point you in the right direction.
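As a starting point for exploring the native NUMA functionality, here is a rough sketch that lists the NUMA nodes Windows reports and the processor mask of each, by calling the Win32 functions `GetNumaHighestNodeNumber` and `GetNumaNodeProcessorMask` through P/Invoke. It is not taken from the CodePlex project above; the `NumaTopologySketch` class name is made up, and the sketch assumes Windows with no more than 64 logical processors (on larger systems these masks are relative to a processor group).

```csharp
using System;
using System.Runtime.InteropServices;

static class NumaTopologySketch
{
    // Win32: BOOL GetNumaHighestNodeNumber(PULONG HighestNodeNumber)
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool GetNumaHighestNodeNumber(out uint highestNodeNumber);

    // Win32: BOOL GetNumaNodeProcessorMask(UCHAR Node, PULONGLONG ProcessorMask)
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool GetNumaNodeProcessorMask(byte node, out ulong processorMask);

    static void Main()
    {
        uint highestNode;
        if (!GetNumaHighestNodeNumber(out highestNode))
        {
            Console.WriteLine("GetNumaHighestNodeNumber failed, error {0}",
                Marshal.GetLastWin32Error());
            return;
        }

        // Node numbers run from 0 to highestNode inclusive.
        for (uint node = 0; node <= highestNode; node++)
        {
            ulong mask;
            if (GetNumaNodeProcessorMask((byte)node, out mask))
            {
                Console.WriteLine("NUMA node {0}: processor mask 0x{1:X}", node, mask);
            }
        }
    }
}
```

A mask printed for a node could then be used as the affinity mask, so that the process stays on one node's cores instead of on a single core.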

Slow .NET code on AMD Opteron CPUs
