How to benchmark compareAndSet against synchronized

Date: 2017-05-14 05:46:19

Tags: java jmh

I want to use CAS to improve my code, but I am not sure it actually performs better, so I wrote a benchmark. Here is the test code. Is this JMH code reliable?

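    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicReference;
    import org.openjdk.jmh.annotations.*;

    @OutputTimeUnit(TimeUnit.MILLISECONDS)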
    @BenchmarkMode(Mode.SampleTime)
    @Warmup(iterations = 5)
    @Measurement(iterations = 10, time = 5, timeUnit = TimeUnit.SECONDS)
    @Threads(20)
    @Fork(1)
    @State(Scope.Benchmark)
    public class CASBench {
        private int id=24;
        private static Object[] lockObj;
        private static AtomicReference<Integer>[] locks;
        static {
            lockObj = new Object[100];
            for (int i = 0; i < lockObj.length; i++) {
                lockObj[i] = new Object();
            }

            locks = new AtomicReference[100];
            for (int i = 0; i < locks.length; i++) {
                locks[i] = new AtomicReference<Integer>(null);
            }
        }
        @Benchmark
        public void sync() throws Exception {
            int index = id % 100;
            synchronized (lockObj[index]) {
                test();
            }
        }
        @Benchmark
        public void cas() throws Exception {
            AtomicReference<Integer> lock = locks[id % 100];
            while (!lock.compareAndSet(null, id)) {
            }
            test();
            lock.compareAndSet(id, null);
        }

        public void test() throws Exception {
            int sum=0;
            for(int i=0;i<100;i++){
                sum += i;
            }
        }
    }

I got the following JMH results:

Benchmark                     Mode       Cnt    Score    Error  Units
CASBench.cas                sample  25866638    0.014 ±  0.001  ms/op
CASBench.cas:cas·p0.00      sample             ≈ 10⁻⁶           ms/op
CASBench.cas:cas·p0.50      sample             ≈ 10⁻⁴           ms/op
CASBench.cas:cas·p0.90      sample              0.001           ms/op
CASBench.cas:cas·p0.95      sample              0.001           ms/op
CASBench.cas:cas·p0.99      sample              0.001           ms/op
CASBench.cas:cas·p0.999     sample              0.002           ms/op
CASBench.cas:cas·p0.9999    sample             38.164           ms/op
CASBench.cas:cas·p1.00      sample            813.695           ms/op
CASBench.sync               sample  26257757    0.011 ±  0.001  ms/op
CASBench.sync:sync·p0.00    sample             ≈ 10⁻⁶           ms/op
CASBench.sync:sync·p0.50    sample             ≈ 10⁻⁴           ms/op
CASBench.sync:sync·p0.90    sample              0.001           ms/op
CASBench.sync:sync·p0.95    sample              0.001           ms/op
CASBench.sync:sync·p0.99    sample              0.005           ms/op
CASBench.sync:sync·p0.999   sample              1.883           ms/op
CASBench.sync:sync·p0.9999  sample             15.270           ms/op
CASBench.sync:sync·p1.00    sample             45.810           ms/op

Can I conclude that synchronized is better in this case?

1 Answer:

Answer 0: (score: 1)

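Your test is indeed not correct, as far as I can tell. First of all, your benchmark should return a value, as shown in the JMH samples, or consume its result with a Blackhole; otherwise the JIT may eliminate the computation as dead code.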

There are two ways to test this: with contention and without.

Let's see what happens under contention first, since it is easier to grasp:

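import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@OutputTimeUnit(TimeUnit.NANOSECONDS)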
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class Contention {

    public static void main(String[] args) throws RunnerException {

        Options opt = new OptionsBuilder()
                .jvmArgs("-ea")
                .shouldFailOnError(true)
                .include(Contention.class.getSimpleName()).build();
        new Runner(opt).run();
    }

    private AtomicInteger atomic;

    private Object lock = new Object();

    private int i = 0;

    @Setup
    public void setUp() {
        atomic = new AtomicInteger(0);
    }

    @Fork(1)
    @Threads(10)
    @Benchmark
    public int incrementAtomic() {
        return atomic.incrementAndGet();
    }

    @Fork(1)
    @Threads(10)
    @Benchmark
    public int incrementSync() {
        synchronized (lock) {
            ++i;
        }

        return i;
    }
}

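The code should be self-explanatory, with one note about this annotation:

@State(Scope.Benchmark)

If you change it to @State(Scope.Thread), each thread gets its own lock, so there is no contention and the result is distorted by biased locking. That means that if you run this code with:

@State(Scope.Thread)

the two results will be very similar. Like this: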

Benchmark                                     Mode  Cnt   Score   Error  Units
casVSsynchronized.Contention.incrementAtomic  avgt    5  36.526 ± 6.548  ns/op
casVSsynchronized.Contention.incrementSync    avgt    5  23.655 ± 3.393  ns/op
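
The difference between the two scopes can be pictured in plain Java, without JMH. The sketch below is only an analogy: a single shared object stands in for Scope.Benchmark and a ThreadLocal stands in for Scope.Thread. The names ScopeSketch and distinctLocks are illustrative and are not part of JMH.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class ScopeSketch {

    // Starts `threads` threads, lets each one fetch its lock from `source`,
    // and returns how many distinct lock objects were handed out in total.
    static int distinctLocks(int threads, Supplier<Object> source) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> seen.add(source.get()));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Assumed analogue of Scope.Benchmark: one instance shared by every thread.
        Object shared = new Object();
        // Assumed analogue of Scope.Thread: each thread lazily gets its own instance.
        ThreadLocal<Object> perThread = ThreadLocal.withInitial(Object::new);

        System.out.println(distinctLocks(10, () -> shared));   // prints 1
        System.out.println(distinctLocks(10, perThread::get)); // prints 10
    }
}
```

With ten distinct locks no thread ever waits for another, which is why a per-thread run says little about behaviour under contention.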

Running it with:

@State(Scope.Benchmark)

shows a completely different picture. Under contention CAS fares better, as you can see from the results:

Benchmark                                     Mode  Cnt    Score    Error  Units
casVSsynchronized.Contention.incrementAtomic  avgt    5  212.997 ± 42.902  ns/op
casVSsynchronized.Contention.incrementSync    avgt    5  457.896 ± 46.811  ns/op 
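
To make the contended case concrete, here is a minimal plain-Java sketch of the two update strategies being compared: an optimistic compareAndSet retry loop and a synchronized increment. It checks correctness only and measures nothing. The class ContendedCounters and its methods are hypothetical names, and the retry loop is a conceptual model; the JVM may compile incrementAndGet to a single atomic add instead of a loop like this.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ContendedCounters {
    private final AtomicInteger atomic = new AtomicInteger();
    private final Object lock = new Object();
    private int plain;

    // Optimistic update: read, compute, and retry if another thread got there first.
    int incrementCas() {
        while (true) {
            int current = atomic.get();
            int next = current + 1;
            if (atomic.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Pessimistic update: acquire the monitor, then read and write under it.
    int incrementSync() {
        synchronized (lock) {
            return ++plain;
        }
    }

    // Updates both counters from `threads` threads; returns {CAS total, synchronized total}.
    static int[] run(int threads, int perThread) {
        ContendedCounters c = new ContendedCounters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    c.incrementCas();
                    c.incrementSync();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        synchronized (c.lock) {
            return new int[] { c.atomic.get(), c.plain };
        }
    }

    public static void main(String[] args) {
        int[] totals = run(10, 100_000);
        System.out.println(totals[0] + " " + totals[1]); // prints 1000000 1000000
    }
}
```

Both strategies lose no updates; what differs under contention is whether a losing thread retries immediately (CAS) or may be parked while waiting for the monitor (synchronized).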

Then I have a more complicated test (which may need a more rigorous review from the JMH devs):

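import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;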

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
public class CASSync {

    public static void main(String[] args) throws RunnerException {

        Options opt = new OptionsBuilder()
                .jvmArgs("-ea")
                .shouldFailOnError(true)
                .include(CASSync.class.getSimpleName()).build();
        new Runner(opt).run();
    }

    @State(Scope.Thread)
    static public class AtomicHolder {

        AtomicInteger i = null;

        @Setup(Level.Invocation)
        public void setUp() {
            i = new AtomicInteger(0);
        }

        @TearDown(Level.Invocation)
        public void tearDown() {
            assert i.intValue() == 1;
            i = null;
        }

    }

    @State(Scope.Thread)
    static public class SyncHolder {

        int i = 0;

        Object lock = null;

        @Setup(Level.Invocation)
        public void setUp() {
            lock = new Object();
            i = 0;
        }

        @TearDown(Level.Invocation)
        public void tearDown() {
            assert i == 1;
            lock = null;
        }

    }

    @Benchmark
    @Fork(1)
    public boolean cas(AtomicHolder holder) {
        return holder.i.compareAndSet(0, 1);
    }

    @Benchmark
    @Fork(1)
    public boolean sync(SyncHolder holder) {
        synchronized (holder.lock) {
            ++holder.i;
        }

        return holder.i == 1;
    }

}

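This test covers the case with no contention at all (like the Scope.Thread run of the first benchmark), but this time it gets rid of biased locking. The results: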

 Benchmark                       Mode  Cnt   Score   Error  Units
 casVSsynchronized.CASSync.cas   avgt    5  44.003 ± 1.343  ns/op
 casVSsynchronized.CASSync.sync  avgt    5  50.744 ± 1.370  ns/op
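
The per-invocation setup in that benchmark matters because compareAndSet(0, 1) can only succeed while the value is still 0. A small plain-Java sketch of that behaviour follows; UncontendedStep and casStep are illustrative names that mirror the body of the cas benchmark method.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class UncontendedStep {

    // Same operation as the cas benchmark body: succeeds only if the value is still 0.
    static boolean casStep(AtomicInteger i) {
        return i.compareAndSet(0, 1);
    }

    public static void main(String[] args) {
        AtomicInteger fresh = new AtomicInteger(0); // what setUp creates before each invocation
        System.out.println(casStep(fresh));         // prints true
        System.out.println(fresh.get());            // prints 1
        System.out.println(casStep(fresh));         // prints false, the value is no longer 0
    }
}
```

Without a fresh AtomicInteger for every call, all invocations after the first would take the failing path, so the benchmark would be timing a different operation.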

My conclusion: under contention, CAS is better. For the other cases, it is debatable.