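Performance differences between thread-safe getters in C#

Time: 2013-04-11 13:57:02

Tags: c# performance locking

I am writing a thread-safe object that basically represents a double and uses a lock to ensure safe reads and writes. I use many of these objects (20-30) in a piece of code that reads and writes all of them 100 times per second, and I am measuring the average computation time of each time step. I started looking at a few options for implementing my getter, and after running many tests and collecting many samples to average my computation-time measurements, I found that some implementations consistently perform better than others, but not the ones I expected.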

Implementation 1) Average computation time = 0.607 ms:

protected override double GetValue()
{
    lock(_sync)
    {
        return _value;
    }
}

Implementation 2) Average computation time = 0.615 ms:

protected override double GetValue()
{
    double result;
    lock(_sync)
    {
        result = _value;
    }
    return result;
}

Implementation 3) Average computation time = 0.560 ms:

protected override double GetValue()
{
    double result = 0;
    lock(_sync)
    {
        result = _value;
    }
    return result;
}

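My expectations: I expected implementation 3 to be the worst of the three (it was actually my original code, so I wrote it that way by chance or through lazy coding), but surprisingly it is consistently the best performer. I expected implementation 1 to be the fastest. I also expected implementation 2 to be at least as fast as implementation 3, since all I did was remove an initial assignment to the double result that gets overwritten anyway and is therefore unnecessary.

My question: can anyone explain why these three implementations have the relative performance I measured? It seems counterintuitive to me, and I would really like to know the reason.

I realize these differences are small, but their relative ordering is consistent every time I run the test, and each test collects thousands of samples to average the computation time. Also, keep in mind that I am running these tests because my application needs very high performance, or at least as good as I can reasonably get. My test case is only a small one, and the performance of my code when it runs in release is very important.

Edit: note that I am using MonoTouch and running the code on an iPad Mini device, so this may have less to do with C# and more to do with MonoTouch's cross compiler.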

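2 answers:

Answer 0: (score: 15)

Frankly, there are other, better approaches available here. The following is the output (ignoring the x1 run, which is only there for JIT):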

x5000000
Example1        128ms
Example2        136ms
Example3        129ms
CompareExchange 53ms
ReadUnsafe      54ms
UntypedBox      23ms
TypedBox        12ms

x5000000
Example1        129ms
Example2        129ms
Example3        129ms
CompareExchange 52ms
ReadUnsafe      53ms
UntypedBox      23ms
TypedBox        12ms

x5000000
Example1        129ms
Example2        161ms
Example3        129ms
CompareExchange 52ms
ReadUnsafe      53ms
UntypedBox      23ms
TypedBox        12ms

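All of these are thread-safe implementations. As you can see, the fastest is a typed box, followed by an untyped (object) box. Next (at about the same speed as each other) come Interlocked.CompareExchange / Interlocked.Read - note that the latter only supports long, so we need to do some bit-bashing to treat it as a double.

Obviously, test on your target framework.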
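
The "bit-bashing" mentioned above can also be done without unsafe code. What follows is only a minimal sketch, not what was benchmarked in this answer (the ReadUnsafe experiment uses pointer casts): the class name InterlockedDouble is made up for illustration, and its timings may well differ from the figures shown here.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the original benchmark.
sealed class InterlockedDouble
{
    // The double is stored as its raw 64-bit pattern so that the
    // long-only Interlocked APIs can operate on it.
    private long _bits;

    public InterlockedDouble(double initial)
    {
        _bits = BitConverter.DoubleToInt64Bits(initial);
    }

    public double Value
    {
        // Atomic 64-bit read, then reinterpret the bits as a double.
        get { return BitConverter.Int64BitsToDouble(Interlocked.Read(ref _bits)); }
        // Reinterpret the double as bits, then publish them atomically.
        set { Interlocked.Exchange(ref _bits, BitConverter.DoubleToInt64Bits(value)); }
    }
}
```

BitConverter.DoubleToInt64Bits and BitConverter.Int64BitsToDouble reinterpret the bit pattern rather than converting the numeric value, which is the same thing the pointer casts do.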

For fun, I also tested a Mutex; in the same scale test, it takes around 3300 ms.
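
The Mutex variant is not shown in the answer, so the following is only a guess at what such an experiment might look like. The class name MutexDouble is hypothetical, and the class is written standalone rather than as an Experiment subclass so that it compiles on its own.

```csharp
using System.Threading;

// Hypothetical reconstruction; the answer does not show its Mutex code.
sealed class MutexDouble
{
    private readonly Mutex _mutex = new Mutex();
    private double _value = 3;

    public double GetValue()
    {
        _mutex.WaitOne();               // acquire the kernel-level mutex
        try { return _value; }
        finally { _mutex.ReleaseMutex(); }
    }

    public void SetValue(double value)
    {
        _mutex.WaitOne();
        try { _value = value; }
        finally { _mutex.ReleaseMutex(); }
    }
}
```

A Mutex wraps an operating-system synchronization object, so every acquire and release is considerably more expensive than the lock statement (Monitor), which is consistent with the much larger timing reported above.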

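Answer 1: (score: 6)

Measuring only concurrent reads is misleading: the cache will give you much better results than you would see in a real use case. So I added SetValue to Marc's example: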

using System;
using System.Diagnostics;
using System.Threading;

abstract class Experiment
{
    public abstract double GetValue();
    public abstract void SetValue(double value);
}

class Example1 : Experiment
{
    private readonly object _sync = new object();
    private double _value = 3;
    public override double GetValue()
    {
        lock (_sync)
        {
            return _value;
        }
    }

    public override void SetValue(double value)
    {
        lock (_sync)
        {
            _value = value;
        }

    }

}
class Example2 : Experiment
{
    private readonly object _sync = new object();
    private double _value = 3;
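    public override double GetValue()
    {
        // Implementation 2 from the question: the local is declared
        // without an initializer and assigned inside the lock.
        double result;
        lock (_sync)
        {
            result = _value;
        }
        return result;
    }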

    public override void SetValue(double value)
    {
        lock (_sync)
        {
            _value = value;
        }
    }

}



class Example3 : Experiment
{
    private readonly object _sync = new object();
    private double _value = 3;
    public override double GetValue()
    {
        double result = 0;
        lock (_sync)
        {
            result = _value;
        }
        return result;
    }

    public override void SetValue(double value)
    {
        lock (_sync)
        {
            _value = value;
        }
    }
}

class CompareExchange : Experiment
{
    private double _value = 3;
    public override double GetValue()
    {
        return Interlocked.CompareExchange(ref _value, 0, 0);
    }

    public override void SetValue(double value)
    {
        Interlocked.Exchange(ref _value, value);
    }
}
class ReadUnsafe : Experiment
{
    private long _value = DoubleToInt64(3);
    static unsafe long DoubleToInt64(double val)
    {   // I'm mainly including this for the field initializer
        // in real use this would be manually inlined
        return *(long*)(&val);
    }
    public override unsafe double GetValue()
    {
        long val = Interlocked.Read(ref _value);
        return *(double*)(&val);
    }

    public override void SetValue(double value)
    {
        long intValue = DoubleToInt64(value);
        Interlocked.Exchange(ref _value, intValue);
    }
}
class UntypedBox : Experiment
{
    // references are always atomic
    private volatile object _value = 3.0;
    public override double GetValue()
    {
        return (double)_value;
    }

    public override void SetValue(double value)
    {
        object valueObject = value;
        _value = valueObject;
    }
}
class TypedBox : Experiment
{
    private sealed class Box
    {
        public readonly double Value;
        public Box(double value) { Value = value; }

    }
    // references are always atomic
    private volatile Box _value = new Box(3);
    public override double GetValue()
    {
        Box value = _value;
        return value.Value;
    }

    public override void SetValue(double value)
    {
        Box boxValue = new Box(value);
        _value = boxValue;
    }
}
static class Program
{
    static void Main()
    {
        // once for JIT
        RunExperiments(1);
        // three times for real
        RunExperiments(5000000);
        RunExperiments(5000000);
        RunExperiments(5000000);
    }
    static void RunExperiments(int loop)
    {
        Console.WriteLine("x{0}", loop);
        RunExperiment(new Example1(), loop);
        RunExperiment(new Example2(), loop);
        RunExperiment(new Example3(), loop);
        RunExperiment(new CompareExchange(), loop);
        RunExperiment(new ReadUnsafe(), loop);
        RunExperiment(new UntypedBox(), loop);
        RunExperiment(new TypedBox(), loop);
        Console.WriteLine();
    }
    static void RunExperiment(Experiment test, int loop)
    {
        // avoid any GC interruptions
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.WaitForPendingFinalizers();

        int threads = Environment.ProcessorCount;

        ManualResetEvent done = new ManualResetEvent(false);

        // Since we use threads, divide the original workload
        //
        int workerLoop = Math.Max(1, loop / Environment.ProcessorCount);
        int writeRatio = 1000;
        int writes = Math.Max(workerLoop / writeRatio, 1);
        int reads = workerLoop / writes;

        var watch = Stopwatch.StartNew();

        for (int t = 0; t < Environment.ProcessorCount; ++t)
        {
            ThreadPool.QueueUserWorkItem((state) =>
                {
                    try
                    {
                        double val = 0;

                        // Two loops to avoid comparison for % in the inner loop
                        //
                        for (int j = 0; j < writes; ++j)
                        {
                            test.SetValue(j);
                            for (int i = 0; i < reads; i++)
                            {
                                val = test.GetValue();
                            }
                        }
                    }
                    finally
                    {
                        if (0 == Interlocked.Decrement(ref threads))
                        {
                            done.Set();
                        }
                    }
                });
        }
        done.WaitOne();
        watch.Stop();
        Console.WriteLine("{0}\t{1}ms", test.GetType().Name,
            watch.ElapsedMilliseconds);

    }
}

Results, with a 1000:1 read:write ratio:

x5000000
Example1        353ms
Example2        395ms
Example3        369ms
CompareExchange 150ms
ReadUnsafe      161ms
UntypedBox      11ms
TypedBox        9ms

100:1 (read:write)

x5000000
Example1        356ms
Example2        360ms
Example3        356ms
CompareExchange 161ms
ReadUnsafe      172ms
UntypedBox      14ms
TypedBox        13ms

10:1 (read:write)

x5000000
Example1        383ms
Example2        394ms
Example3        414ms
CompareExchange 169ms
ReadUnsafe      176ms
UntypedBox      41ms
TypedBox        43ms

2:1 (read:write)

x5000000
Example1        550ms
Example2        581ms
Example3        560ms
CompareExchange 257ms
ReadUnsafe      292ms
UntypedBox      101ms
TypedBox        122ms

1:1 (read:write)

x5000000
Example1        718ms
Example2        745ms
Example3        730ms
CompareExchange 381ms
ReadUnsafe      376ms
UntypedBox      161ms
TypedBox        200ms

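*Updated the code to remove an unnecessary ICX (Interlocked.CompareExchange) operation on write, since the value is always overwritten. Also fixed the formula so that the number of reads is divided per thread (same total work).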
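
To make the per-thread split concrete, here is a small sketch that repeats the arithmetic from RunExperiment in isolation. The WorkSplit class and the threadCount parameter are illustrative names only; threadCount stands in for Environment.ProcessorCount, and the value 4 used in the note below is an assumption, not the machine used for the results above.

```csharp
using System;

// Hypothetical helper that mirrors the arithmetic in RunExperiment.
static class WorkSplit
{
    public static void Compute(int loop, int threadCount, int writeRatio,
                               out int writes, out int reads)
    {
        // Each thread gets an equal share of the requested iterations.
        int workerLoop = Math.Max(1, loop / threadCount);
        // Number of SetValue calls per thread (at least one).
        writes = Math.Max(workerLoop / writeRatio, 1);
        // Number of GetValue calls performed after each SetValue.
        reads = workerLoop / writes;
    }
}
```

With loop = 5000000, threadCount = 4 and writeRatio = 1000, each thread performs 1250 writes, each followed by 1000 reads. With writeRatio = 1, each thread performs 1250000 writes and 1250000 reads, so the lower-ratio runs carry out more operations in total, not just a different mix.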