
Time: 2011-08-18 23:40:47

Tags: java hashmap

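I'm trying to work out the optimal capacity and load factor for a specific case. I think I've got the gist of it, but I'd still appreciate confirmation from someone more knowledgeable than me. :)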



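  • Curiously, capacity, capacity+1, capacity+2, capacity-1 and even capacity-10 all yield exactly the same results. I would have expected at least capacity-1 and capacity-10 to give worse results.
  • Using an initial capacity (as opposed to the default value of 16) gives a noticeable put() improvement - up to 30% faster.
  • Using a load factor of 1 gives equal performance for a small number of objects and better performance for a large number of objects (>100000). However, the improvement is not proportional to the number of objects; I suspect additional factors affect the results.
  • get() performance differs slightly for different numbers of objects/capacities, but although it may vary a little from case to case, it is generally unaffected by the initial capacity or the load factor.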


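So, let's see what I've got. The following two charts show the difference between load factors. The first chart shows what happens when the HashMap is filled to capacity; load factor 0.75 performs worse because of resizing. However, it is not consistently worse, and there are all sorts of bumps and jumps - I guess GC plays a major part in this. Load factor 1.25 performs the same as 1, so it is not included in the chart.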

[Chart: fully filled]


[Chart: half full]


[Chart: go spike]


import java.util.HashMap;
import java.util.Map;

public class HashMapTest {

  // capacity - numbers as high as 10000000 require the -mx1536m -ms1536m JVM parameters
  public static final int CAPACITY = 10000000;
  public static final int ITERATIONS = 10000;

  // set to false to print put performance, or to true to print get performance
  boolean doIterations = false;

  private Map<Integer, String> cache;

  // Inserts keys 0..capacity (inclusive) and prints the elapsed time in ms.
  public void fillCache(int capacity) {
    long t = System.currentTimeMillis();
    for (int i = 0; i <= capacity; i++)
      cache.put(i, "Value number " + i);

    if (!doIterations) {
      System.out.print(System.currentTimeMillis() - t);
      // separator between samples (assumed; the timings would otherwise run together)
      System.out.print("\t");
    }
  }

  // Performs ITERATIONS random lookups and prints the elapsed time in ms.
  public void iterate(int capacity) {
    long t = System.currentTimeMillis();

    for (int i = 0; i <= ITERATIONS; i++) {
      long x = Math.round(Math.random() * capacity);
      String result = cache.get((int) x);
    }

    if (doIterations) {
      System.out.print(System.currentTimeMillis() - t);
      // separator between samples (assumed, as above)
      System.out.print("\t");
    }
  }

  // Runs one series: map sizes from 10000 up to CAPACITY in steps of 10000.
  public void test(float loadFactor, int divider) {
    for (int i = 10000; i <= CAPACITY; i+= 10000) {
      cache = new HashMap<Integer, String>(i, loadFactor);
      fillCache(i / divider);
      if (doIterations)
        iterate(i / divider);
    }
    // end of one series (assumed line break)
    System.out.println();
  }

  public static void main(String[] args) {
    HashMapTest test = new HashMapTest();

    // fill to capacity
    test.test(0.75f, 1);
    test.test(1, 1);
    test.test(1.25f, 1);

    // fill to half capacity
    test.test(0.75f, 2);
    test.test(1, 2);
    test.test(1.25f, 2);
  }

}

5 answers:

Answer 0 (score: 69)

Answer 1 (score: 10)



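"Curiously, capacity, capacity+1, capacity+2, capacity-1 and even capacity-10 all yield exactly the same results. I would have expected at least capacity-1 and capacity-10 to give worse results."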


This is the constructor, straight from the JDK source:

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 *
 * @param  initialCapacity the initial capacity
 * @param  loadFactor      the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);

    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;

    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}
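
The rounding loop in that constructor is probably why the nearby capacities in the question behave identically. The sketch below copies only that loop into a standalone helper; the class name `CapacityRounding`, the method name `tableSizeFor` and the base value 10000 are illustrative assumptions, and the `MAXIMUM_CAPACITY` clamp is left out for brevity.

```java
class CapacityRounding {

    // Same loop as in the quoted constructor: the smallest power of two
    // that is >= initialCapacity (MAXIMUM_CAPACITY clamp omitted).
    static int tableSizeFor(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        int base = 10000; // illustrative value, taken from the first step of the question's test
        int[] requested = {base - 10, base - 1, base, base + 1, base + 2};
        for (int c : requested) {
            // every line prints 16384, because 8192 < c <= 16384 for all five values
            System.out.println(c + " -> " + tableSizeFor(c));
        }
    }
}
```

All five requests end up with the same table size, so a difference would only show up when a request crosses a power-of-two boundary.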

Answer 2 (score: 2)




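First off, I'm assuming that your HashMap will not grow beyond 100; if it does, you should leave the load factor as it is. Similarly, if your concern is performance, leave the load factor as it is. If your concern is memory, you can save some by setting a static size. This might be worth doing if you are cramming a lot of stuff into memory, i.e. storing many maps, or creating maps large enough to put pressure on the heap.

Second, I chose the value 101 because it offers better readability... If I look at your code later and see that you have set the initial capacity to 100 and are loading it with 100 elements, I will have to read through the Javadoc to make sure it won't resize when it reaches precisely 100. Of course, I won't find the answer there, so I'll have to look at the source. That is not worth it... Just leave it at 101 and everyone is happy and nobody has to look through the source code of java.util.HashMap. Hoorah.

Third, the claim that setting the HashMap to the exact capacity you expect with a load factor of 1 "will kill your lookup and insertion performance" is simply not true, even if it is made in bold.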
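
To check the 100-versus-101 question without reading the JDK source, the arithmetic can be simulated. This is a rough sketch under stated assumptions, not the JDK's code: it assumes the table size is the smallest power of two at or above the requested capacity, that the threshold is `(int)(tableSize * loadFactor)`, and that the table doubles once the entry count exceeds the threshold. The class and method names are made up for illustration.

```java
class ResizeCount {

    // Counts how many times the table would double while `entries` distinct
    // keys are inserted, under the assumptions stated above.
    static int resizes(int initialCapacity, float loadFactor, int entries) {
        int table = 1;
        while (table < initialCapacity) {
            table <<= 1;
        }
        int count = 0;
        while (entries > (int) (table * loadFactor)) {
            table <<= 1; // double the table, which also doubles the threshold
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(resizes(100, 1f, 100));   // 0: table 128, threshold 128
        System.out.println(resizes(101, 1f, 100));   // 0: also table 128
        System.out.println(resizes(16, 0.75f, 100)); // 4: 16 -> 32 -> 64 -> 128 -> 256
    }
}
```

Under these assumptions, 100 and 101 are indistinguishable at run time, so the choice between them really is only about readability.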




import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;

// Wrapper class added so that the snippet compiles; the name is arbitrary.
public class HashMapBenchmark {

    static Random r = new Random();

    public static void main(String[] args){
        int[] tests = {100, 1000, 10000};
        int runs = 5000;

        float lf_sta = 1f;    // "static": exact capacity, load factor 1
        float lf_dyn = 0.75f; // "dynamic": capacity scaled by the default load factor

        for(int t:tests){
            System.err.println("=======Test Put "+t+"");
            HashMap<Integer,Integer> map = new HashMap<Integer,Integer>();
            long norm_put = testInserts(map, t, runs);
            System.err.print("Norm put:"+norm_put+" ms. ");

            int cap_sta = t;
            map = new HashMap<Integer,Integer>(cap_sta, lf_sta);
            long sta_put = testInserts(map, t, runs);
            System.err.print("Static put:"+sta_put+" ms. ");

            int cap_dyn = (int)Math.ceil((float)t/lf_dyn);
            map = new HashMap<Integer,Integer>(cap_dyn, lf_dyn);
            long dyn_put = testInserts(map, t, runs);
            System.err.println("Dynamic put:"+dyn_put+" ms. ");
        }

        for(int t:tests){
            System.err.println("=======Test Get (hits) "+t+"");
            HashMap<Integer,Integer> map = new HashMap<Integer,Integer>();
            fill(map, t);
            long norm_get_hits = testGetHits(map, t, runs);
            System.err.print("Norm get (hits):"+norm_get_hits+" ms. ");

            int cap_sta = t;
            map = new HashMap<Integer,Integer>(cap_sta, lf_sta);
            fill(map, t);
            long sta_get_hits = testGetHits(map, t, runs);
            System.err.print("Static get (hits):"+sta_get_hits+" ms. ");

            int cap_dyn = (int)Math.ceil((float)t/lf_dyn);
            map = new HashMap<Integer,Integer>(cap_dyn, lf_dyn);
            fill(map, t);
            long dyn_get_hits = testGetHits(map, t, runs);
            System.err.println("Dynamic get (hits):"+dyn_get_hits+" ms. ");
        }

        for(int t:tests){
            System.err.println("=======Test Get (Rand) "+t+"");
            HashMap<Integer,Integer> map = new HashMap<Integer,Integer>();
            fill(map, t);
            long norm_get_rand = testGetRand(map, t, runs);
            System.err.print("Norm get (rand):"+norm_get_rand+" ms. ");

            int cap_sta = t;
            map = new HashMap<Integer,Integer>(cap_sta, lf_sta);
            fill(map, t);
            long sta_get_rand = testGetRand(map, t, runs);
            System.err.print("Static get (rand):"+sta_get_rand+" ms. ");

            int cap_dyn = (int)Math.ceil((float)t/lf_dyn);
            map = new HashMap<Integer,Integer>(cap_dyn, lf_dyn);
            fill(map, t);
            long dyn_get_rand = testGetRand(map, t, runs);
            System.err.println("Dynamic get (rand):"+dyn_get_rand+" ms. ");
        }
    }

    public static long testInserts(HashMap<Integer,Integer> map, int test, int runs){
        long b4 = System.currentTimeMillis();

        for(int i=0; i<runs; i++){
            fill(map, test);
            // assumption (line missing in the source): empty the map between runs
            map.clear();
        }
        return System.currentTimeMillis()-b4;
    }

    public static void fill(HashMap<Integer,Integer> map, int test){
        for(int j=0; j<test; j++){
            if(map.put(r.nextInt(), j)!=null){
                // assumption (line missing in the source): retry on a duplicate key
                j--;
            }
        }
    }

    public static long testGetHits(HashMap<Integer,Integer> map, int test, int runs){
        long b4 = System.currentTimeMillis();

        ArrayList<Integer> keys = new ArrayList<Integer>();
        keys.addAll(map.keySet());

        for(int i=0; i<runs; i++){
            for(int j=0; j<test; j++){
                // assumption (line missing in the source): look up a key known to be present
                map.get(keys.get(r.nextInt(keys.size())));
            }
        }
        return System.currentTimeMillis()-b4;
    }

    public static long testGetRand(HashMap<Integer,Integer> map, int test, int runs){
        long b4 = System.currentTimeMillis();

        for(int i=0; i<runs; i++){
            for(int j=0; j<test; j++){
                // assumption (line missing in the source): look up a random key, almost always a miss
                map.get(r.nextInt());
            }
        }
        return System.currentTimeMillis()-b4;
    }
}


=======Test Put 100
Norm put:78 ms. Static put:78 ms. Dynamic put:62 ms. 
=======Test Put 1000
Norm put:764 ms. Static put:763 ms. Dynamic put:748 ms. 
=======Test Put 10000
Norm put:12921 ms. Static put:12889 ms. Dynamic put:12873 ms. 
=======Test Get (hits) 100
Norm get (hits):47 ms. Static get (hits):31 ms. Dynamic get (hits):32 ms. 
=======Test Get (hits) 1000
Norm get (hits):327 ms. Static get (hits):328 ms. Dynamic get (hits):343 ms. 
=======Test Get (hits) 10000
Norm get (hits):3304 ms. Static get (hits):3366 ms. Dynamic get (hits):3413 ms. 
=======Test Get (Rand) 100
Norm get (rand):63 ms. Static get (rand):46 ms. Dynamic get (rand):47 ms. 
=======Test Get (Rand) 1000
Norm get (rand):483 ms. Static get (rand):499 ms. Dynamic get (rand):483 ms. 
=======Test Get (Rand) 10000
Norm get (rand):5190 ms. Static get (rand):5362 ms. Dynamic get (rand):5236 ms. 

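re: ↑ - there's about this →||← much difference between the different settings.

With respect to my original answer (the bit above the first horizontal line), it was deliberately glib, because in most cases this type of micro-optimising is not good.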

Answer 3 (score: 2)

Implementation-wise, Google Guava has a convenient factory method


which calculates the capacity using the formula

capacity = expectedSize / 0.75F + 1.0F
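
For readers who do not want the Guava dependency, the formula can be applied by hand. The method the answer refers to is presumably `Maps.newHashMapWithExpectedSize`, but its name is missing from the text above, so treat that as an assumption. The helper below is a plain-Java sketch of the quoted formula only, not Guava's actual implementation; `ExpectedSizeCapacity` and `capacityFor` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class ExpectedSizeCapacity {

    // The formula quoted in the answer, truncated to an int.
    static int capacityFor(int expectedSize) {
        return (int) (expectedSize / 0.75F + 1.0F);
    }

    public static void main(String[] args) {
        int expectedSize = 100;
        int capacity = capacityFor(expectedSize); // 100 / 0.75 + 1 = 134.33..., truncated to 134
        Map<Integer, String> map = new HashMap<Integer, String>(capacity);
        for (int i = 0; i < expectedSize; i++) {
            map.put(i, "Value number " + i);
        }
        System.out.println(capacity + " " + map.size());
    }
}
```

The extra `+ 1.0F` keeps the result strictly above `expectedSize / 0.75`, which avoids landing exactly on the resize threshold when the division comes out even.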

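Answer 4 (score: 1)

From the HashMap JavaDoc:


So if you expect 100 entries, a load factor of 0.75 and an initial capacity of ceiling(100 / 0.75) would probably be best. That comes down to 134.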
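
The ceiling calculation can be wrapped in a small helper so the number 134 does not have to be worked out by hand. This is a minimal sketch; `NoRehashSizing` and `initialCapacityFor` are made-up names, and the check in `main` assumes the threshold is `(int)(tableSize * loadFactor)` with the table size rounded up to a power of two.

```java
class NoRehashSizing {

    // Smallest initial capacity such that capacity * loadFactor >= expectedEntries.
    static int initialCapacityFor(int expectedEntries, double loadFactor) {
        return (int) Math.ceil(expectedEntries / loadFactor);
    }

    public static void main(String[] args) {
        int expected = 100;
        int capacity = initialCapacityFor(expected, 0.75); // ceil(133.33...) = 134

        // Assumed HashMap behaviour: table size is the next power of two.
        int table = 1;
        while (table < capacity) {
            table <<= 1;
        }
        int threshold = (int) (table * 0.75);

        // prints "134 256 192": the threshold is well above 100 entries
        System.out.println(capacity + " " + table + " " + threshold);
    }
}
```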

