多键映射 - 性能比较

时间:2015-08-11 12:27:56

标签: java performance hashmap koloboke

上下文

我们的应用程序在许多不同类型的地图中将大量数据存储在内存中,以便快速查找。 为了保持简单(而不是考虑原始地图),它总是带有一个或多个键的地图。 性能是我们的一大要求。

问题

我想找到效果最好的地图实现,并按照建议here,我对这些实现进行了比较:

  1. 基于java.util.HashMap专门针对3个键的地图(嵌套地图)地图:

    Map<K1, Map<K2, Map<K3, V>>>
    
  2. java.util.HashMap中的包装密钥(元组作为键)

    Map<Triple<K1, K2, K3>, V>
    
  3. 元组作为net.openhft.koloboke.collect.map.hash.HashObjObjMap中的键,根据this应该是(最快)地图之一。

    HashObjObjMap<Triple<K1, K2, K3>, V>
    
  4. 期望

    1. 嵌套地图将具有最快的GET和最慢的PUT。
    2. Koloboke哈希映射将比jdk HashMap更快。
    3. 结果

      Benchmark                                                Mode  Cnt   Score   Error  Units
      TupleVsNestedMapsBenchmark.benchGetFromNestedMap         avgt   20  11.586 ± 0.205  ns/op
      TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap  avgt   20  18.619 ± 0.113  ns/op
      TupleVsNestedMapsBenchmark.benchGetFromTupleMap          avgt   20   8.985 ± 0.085  ns/op
      TupleVsNestedMapsBenchmark.benchPutToNestedMap           avgt   20  15.106 ± 0.142  ns/op
      TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap    avgt   20  22.533 ± 0.335  ns/op
      TupleVsNestedMapsBenchmark.benchPutToTupleMap            avgt   20   8.884 ± 0.084  ns/op
      

      基准

      @OutputTimeUnit(TimeUnit.NANOSECONDS)
      @BenchmarkMode(Mode.AverageTime)
      @OperationsPerInvocation(100000)
      @Fork(1)
      @Warmup(iterations = 10)
      @Measurement(iterations = 20)
      public class TupleVsNestedMapsBenchmark {
      
      public static final int N = 10000;
      
      static ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap = new ObjObjObjObjHashMap<>();
      static Map<Triple<String, String, String>, Integer> sourceTupleMap = new HashMap<>();
      static HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap = HashObjObjMaps.newMutableMap();
      
      static {
          for (int i = 0; i < N; i++) {
              sourceNestedMap.put("a-" + i, "b-" + i, "c-" + i, i);
              sourceTupleMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
              sourceTupleKMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
          }
      }
      
      @Benchmark
      public List<Integer> benchGetFromNestedMap() {
          return benchmarkGet(sourceNestedMap::get);
      }
      
      @Benchmark
      public List<Integer> benchGetFromTupleMap() {
          return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
      }
      
      @Benchmark
      public List<Integer> benchGetFromTupleKolobokeMap() {
          return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
      }
      
      @Benchmark
      public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
          ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();
      
          benchmarkPut(map::put);
      
          return map;
      }
      
      @Benchmark
      public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
          Map<Triple<String, String, String>, Integer> map = new HashMap<>();
      
          benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
      
          return map;
      }
      
      @Benchmark
      public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
          HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
      
          benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
      
          return map;
      }
      
      private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
          List<Integer> result = new ArrayList<>(N);
          for (int i = 0; i < N; i++) {
              result.add(mapValueSupplier.supply("a-" + i, "b-" + i, "c-" + i));
      
          }
          return result;
      }
      
      private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
          for (int i = 0; i < N; i++) {
              putValueFunction.apply("a-" + i, "b-" + i, "c-" + i, i);
          }
      }
      
      private interface MapValueSupplier<T> {
      
          T supply(String key1, String key2, String key3);
      }
      
      private interface PutValueFunction<T> {
      
          void apply(String key1, String key2, String key3, T value);
      }
      }
      

      注意:请不要建议使用原始地图。 Integer as(value)只是廉价对象的一个​​例子。

      问题

      1. 为什么koloboke地图比jdk地图慢2.5倍?
      2. 为什么嵌套地图不会更快? (我希望元组密钥对象的分配开销会更大。)
      3. 或者我的基准是错的?然后,我该如何改进呢?
      4. 更新

        基于@leventov的好建议,我更改了Benchmark,并尝试了缓存哈希码的Triple实现(并且具有更好的分布) - 测试命名为Tuple2。

        @State(Scope.Thread)
        @OutputTimeUnit(TimeUnit.NANOSECONDS)
        @BenchmarkMode(Mode.AverageTime)
        @OperationsPerInvocation(TupleVsNestedMapsBenchmark.TOTAL_OPS)
        @Fork(1)
        @Warmup(iterations = 5)
        @Measurement(iterations = 20)
        public class TupleVsNestedMapsBenchmark {
        
        static final int N = 30;
        static final int TOTAL_OPS = N * N * N;
        
        private ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap;
        private Map<Triple<String, String, String>, Integer> sourceTupleMap;
        private HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap;
        private Map<Triple<String, String, String>, Integer> sourceTuple2Map;
        private HashObjObjMap<Triple<String, String, String>, Integer> sourceTuple2KMap;
        private String[] keys;
        
        @Setup
        public void init() {
            sourceNestedMap = new ObjObjObjObjHashMap<>();
            sourceTupleMap = new HashMap<>(TOTAL_OPS);
            sourceTupleKMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
            sourceTuple2Map = new HashMap<>(TOTAL_OPS);
            sourceTuple2KMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
            keys = new String[N];
            for (int i = 0; i < N; i++) {
                keys[i] = "k" + i;
            }
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++) {
                    for (int k = 0; k < N; k++) {
                        sourceNestedMap.put(keys[i], keys[j], keys[k], i);
                        sourceTupleMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i); 
                        sourceTupleKMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i); 
                        sourceTuple2Map.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
                        sourceTuple2KMap.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
                    }
                }
            }
        }
        
        @Benchmark
        public List<Integer> benchGetFromNestedMap() {
            return benchmarkGet(sourceNestedMap::get);
        }
        
        @Benchmark
        public List<Integer> benchGetFromTupleMap() {
            return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
        }
        
        @Benchmark
        public List<Integer> benchGetFromTupleKolobokeMap() {
            return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
        }
        
        @Benchmark
        public List<Integer> benchGetFromTuple2Map() {
            return benchmarkGet(((key1, key2, key3) -> sourceTuple2Map.get(ImmutableTriple2.of(key1, key2, key3))));
        }
        
        @Benchmark
        public List<Integer> benchGetFromTuple2KolobokeMap() {
            return benchmarkGet(((key1, key2, key3) -> sourceTuple2KMap.get(ImmutableTriple2.of(key1, key2, key3))));
        }
        
        @Benchmark
        public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
            ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();
            benchmarkPut(map::put);
            return map;
        }
        
        @Benchmark
        public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
            Map<Triple<String, String, String>, Integer> map = new HashMap<>();
            benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
            return map;
        }
        
        @Benchmark
        public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
            HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
            benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
            return map;
        }
        
        @Benchmark
        public Map<Triple<String, String, String>, Integer> benchPutToTuple2Map() {
            Map<Triple<String, String, String>, Integer> map = new HashMap<>();
            benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
            return map;
        }
        
        @Benchmark
        public Map<Triple<String, String, String>, Integer> benchPutToTuple2KolobokeMap() {
            HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
            benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
            return map;
        }
        
        private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
            List<Integer> result = new ArrayList<>(TOTAL_OPS);
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++) {
                    for (int k = 0; k < N; k++) {
                        Integer value = mapValueSupplier.supply(keys[i], keys[j], keys[k]);
                        result.add(value);
                    }
                }
            }
            return result;
        }
        
        private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
            Integer value = 1;
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++) {
                    for (int k = 0; k < N; k++) {
                        putValueFunction.apply(keys[i], keys[j], keys[k], value);
                    }
                }
            }
        }
        
        private interface MapValueSupplier<T> {
        
            T supply(String key1, String key2, String key3);
        }
        
        private interface PutValueFunction<T> {
        
            void apply(String key1, String key2, String key3, T value);
        }
        }
        

        结果如下:

        Benchmark                                                 Mode  Cnt      Score      Error  Units
        TupleVsNestedMapsBenchmark.benchGetFromNestedMap          avgt   20     24.524 ±    0.144  ns/op
        TupleVsNestedMapsBenchmark.benchGetFromTuple2KolobokeMap  avgt   20     65.604 ±    1.135  ns/op
        TupleVsNestedMapsBenchmark.benchGetFromTuple2Map          avgt   20     22.653 ±    0.745  ns/op
        TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap   avgt   20  34824.901 ± 1718.183  ns/op
        TupleVsNestedMapsBenchmark.benchGetFromTupleMap           avgt   20   2565.835 ±   57.402  ns/op
        TupleVsNestedMapsBenchmark.benchPutToNestedMap            avgt   20     43.160 ±    0.340  ns/op
        TupleVsNestedMapsBenchmark.benchPutToTuple2KolobokeMap    avgt   20    237.300 ±    3.362  ns/op
        TupleVsNestedMapsBenchmark.benchPutToTuple2Map            avgt   20     40.952 ±    0.535  ns/op
        TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap     avgt   20  52315.769 ±  399.769  ns/op
        TupleVsNestedMapsBenchmark.benchPutToTupleMap             avgt   20   3205.538 ±   44.306  ns/op
        

        摘要

        • &#34;元组&#34;如果密钥类的哈希码函数没有缓存和/或分布良好,那么方法会变得非常慢,特别是对于koloboke。
        • 同样结束here(在这个(Obj-Obj)案例中),java.util.HashMap是&#34;极端&#34;快。

3 个答案:

答案 0 :(得分:8)

[回答更新的问题。]

嗯,基准测试仍有问题:

  • 在进行State生命周期时,您应该将状态对象作为参数传递给benhcmark方法(参见下面的代码)。
  • 基准测试put()应该采用不同的方式:1)在@Setup方法中,应该创建集合(具有足够的capacitysize参数) 2)在另一个@Setup(Level.Invocation)方法中,您应该在基准方法中调用collection.clear() 3)测量纯put()

  • 您仍然在基准测试方法中进行了大量分配。这可能是您的情况,但它隐藏了收集性能的贡献。

所以,我写的是:

package tests;

import net.openhft.koloboke.collect.map.hash.HashObjObjMap;
import net.openhft.koloboke.collect.map.hash.HashObjObjMaps;
import org.apache.commons.lang3.tuple.Triple;
import org.openjdk.jmh.annotations.*;

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Threads(1)
@Warmup(iterations = 10)
@Measurement(iterations = 20)
@State(Scope.Thread)
public class SoMultiMap {

    public static final int N = Integer.getInteger("runs", 100000);

    private static final double kbk = Double.parseDouble(System.getProperty("kbk", "1.0"));

    static class ImmutableTriple<L, M, R> extends Triple<L, M, R> {
        public final L left;
        public final M middle;
        public final R right;
        private int h;

        public static <L, M, R> ImmutableTriple<L, M, R> of(L left, M middle, R right) {
            return new ImmutableTriple(left, middle, right);
        }

        public ImmutableTriple(L left, M middle, R right) {
            this.left = left;
            this.middle = middle;
            this.right = right;
        }

        public L getLeft() {
            return this.left;
        }

        public M getMiddle() {
            return this.middle;
        }

        public R getRight() {
            return this.right;
        }

        private int innerHash() {
            int h = left.hashCode();
            h *= 1000003;
            h += middle.hashCode();
            h *= 1000003;
            h += right.hashCode();
            return h * 1000003;
        }

        @Override
        public int hashCode() {
            return h != 0 ? h : (h = innerHash());
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof ImmutableTriple))
                return super.equals(obj);
            ImmutableTriple triple = (ImmutableTriple) obj;
            if (h != 0 && triple.h != 0 && h != triple.h)
                return false;
            return super.equals(obj);
        }
    }

    ImmutableTriple<String, String, String>[] keys = new ImmutableTriple[N];
    Integer[] values = new Integer[N];
    Map<Triple<String, String, String>, Integer> sourceTupleMap;
    HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap;

    @Setup
    public void fill() {
        sourceTupleMap = new HashMap<>((int) (N / 0.75));
        sourceTupleKMap = HashObjObjMaps.newUpdatableMap((int) (N * kbk));
        for (int i = 0; i < N; i++) {
            keys[i] = ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i);
            values[i] = i;
            sourceTupleKMap.put(keys[i], values[i]);
            sourceTupleMap.put(keys[i], values[i]);
        }
    }

    @Benchmark
    public int tupleHashMapGet(SoMultiMap st) {
        ImmutableTriple<String, String, String>[] keys = st.keys;
        Map<Triple<String, String, String>, Integer> map = st.sourceTupleMap;
        int s = 0;
        for (int i = 0; i < N; i++) {
            s += map.get(keys[i]);
        }
        return s;
    }

    @Benchmark
    public int tupleKolobokeGet(SoMultiMap st) {
        ImmutableTriple<String, String, String>[] keys = st.keys;
        HashObjObjMap<Triple<String, String, String>, Integer> map = st.sourceTupleKMap;
        int s = 0;
        for (int i = 0; i < N; i++) {
            s += map.get(keys[i]);
        }
        return s;
    }

    public static void main(String[] args) {
        SoMultiMap st = new SoMultiMap();
        st.fill();
        st.tupleKolobokeGet(st);
        st.tupleHashMapGet(st);
    }
}

现在有趣的是结果:

使用Java 7u55:

HashMap:  65 +- 6 ns/op
Koloboke: 46 +- 2

使用Java 8u51:

HashMap:  42 +- 0.5
Koloboke: 49 +- 1

所以,我们有一些VM更改,介于两者之间,使HashMap大大加快,Koloboke映射 - 稍慢。这需要调查,我现在没有时间。见https://github.com/OpenHFT/Koloboke/issues/42

另外,请注意以下几点:

  • 在服务器VM上运行基准测试
  • 在运行期间禁用CPU缩放
  • 关闭重型应用程序(浏览器,Intellij等),除非您有16个以上的硬件线程

答案 1 :(得分:1)

基准测试的问题列表:

  • 初始化在静态区域完成,应使用@Setup方法和@State s
  • 完成
  • 在基准测试中进行大量分配,并构建字符串!你实际测量的是什么?
  • 注意一个错误 - N是10K,但operationsPerInvocation是100K,所以实际时间非常抑郁
  • String哈希码+非常差Triple哈希码,导致哈希表中的某些群集
  • 在测试嵌套vs元组时,请注意您已选择所有键的所有部分是唯一的,i。即所有嵌套映射都是具有单个键的映射。这不是你想要的

答案 2 :(得分:0)

三重作为抽象是可以的(至少,我没有看到明显更好的替代方案,你可以覆盖Apache Commons'Triple抽象类来定义更好的hashCode()函数。

final class ImmutableTriple<L, M, R> extends Triple<L, M, R> {
    public final L left;
    public final M middle;
    public final R right;
    private int h;

    public static <L, M, R> ImmutableTriple<L, M, R> of(L left, M middle, R right) {
        return new ImmutableTriple(left, middle, right);
    }

    public ImmutableTriple(L left, M middle, R right) {
        this.left = left;
        this.middle = middle;
        this.right = right;
    }

    public L getLeft() {
        return this.left;
    }

    public M getMiddle() {
        return this.middle;
    }

    public R getRight() {
        return this.right;
    }

    private int innerHash() {
        int h = left.hashCode();
        h *= 1000003;
        h += middle.hashCode();
        h *= 1000003;
        h += right.hashCode();
        return (int) LongHashFunction.murmur_3().hashInt(h);
    }

    @Override
    public int hashCode() {
        return h != 0 ? h : (h = innerHash());
    }
}