Why is HashMap containsKey slower than get in the Sun JDK? (sun-jdk-1.6.0.17)

Time: 2011-11-03 17:48:08

Tags: java hashmap jdk1.6

Why is calling containsKey on a HashMap slower than calling get?

Test: http://ideone.com/QsWXF (the difference is more than 15%, run on sun-jdk-1.6.0.17)

4 answers:

Answer 0: (score: 10)

Because it does slightly more work; see the OpenJDK 7 source.


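Note that containsKey calls getEntry, while get directly "does the magic lookup" itself. I don't know why it was written this way, and the use/non-use of getForNullKey confuses me further: see the comments by John B and Ted Hopps on why this is done.

get has an early code split for the null key (note that get returns null both when the entry does not exist and when it exists with a null value):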

    if (key == null)
        return getForNullKey();
    ...
        if (e.hash == hash &&
            ((k = e.key) == key || key.equals(k)))
            return e.value;

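containsKey, by contrast, goes through getEntry, which does not split off to getForNullKey, so there is additional work here to check for the null-key case (for every entry scanned in the chain):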

    if (e.hash == hash &&
        ((k = e.key) == key || (key != null && key.equals(k))))
        return e;

In addition, containsKey has an extra conditional and an extra method call (note that getEntry returns an Entry object if the key exists, even if the stored value is null):

    return getEntry(key) != null;
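The semantic difference noted above (get returns null for a missing key and for a key mapped to null, while containsKey only asks whether an entry exists) can be seen with the public java.util.HashMap API. This is a small illustration only; the class name NullValueLookup and its helper methods are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class NullValueLookup {
    // Builds a small map in which one key is deliberately mapped to null.
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("present", 1);
        map.put("nullValue", null); // the key exists, its value is null
        return map;
    }

    // Reports what get and containsKey say about the same key.
    static String describe(Map<String, Integer> map, String key) {
        return key + ": get=" + map.get(key) + ", containsKey=" + map.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(describe(map, "present"));   // present: get=1, containsKey=true
        System.out.println(describe(map, "nullValue")); // nullValue: get=null, containsKey=true
        System.out.println(describe(map, "absent"));    // absent: get=null, containsKey=false
    }
}
```

This is why a get(key) == null check is only a safe substitute for containsKey when the map never stores null values.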

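I think it could be argued that containsKey would benefit, performance-wise, from having a specialized form (at the cost of less DRY code), or that getEntry could follow get's lead with an early null-key check... On the other hand, it could be argued that get should be written in terms of getEntry ;-)

Happy coding.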
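To make the "specialized form" idea concrete, here is a minimal sketch of a chained hash table whose containsKey handles the null key up front, so the per-entry comparison no longer needs the key != null test. It is not the JDK implementation: ToyHashMap, its fixed 16-bucket table, the simplified hash function, and containsNullKey are all assumptions made for illustration, and whether this form is measurably faster is not shown here.

```java
public class ToyHashMap<K, V> {
    // One node of a bucket's singly linked chain.
    static final class Entry<K, V> {
        final int hash;
        final K key;
        V value;
        final Entry<K, V> next;

        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed-size table; a real map would resize. Length must be a power of two.
    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = (Entry<K, V>[]) new Entry[16];

    // Simplified hash (an assumption, not the JDK 6 function); the null key hashes to 0.
    private static int hash(Object key) {
        if (key == null) return 0;
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    public void put(K key, V value) {
        int hash = hash(key);
        int i = hash & (table.length - 1);
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || (key != null && key.equals(e.key)))) {
                e.value = value; // key already present: overwrite
                return;
            }
        }
        table[i] = new Entry<K, V>(hash, key, value, table[i]);
    }

    // Specialized containsKey: the null key is handled once, before the loop.
    public boolean containsKey(Object key) {
        if (key == null) return containsNullKey();
        int hash = hash(key);
        for (Entry<K, V> e = table[hash & (table.length - 1)]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return true;
        }
        return false;
    }

    // The null key always lives in bucket 0 because its hash is 0.
    private boolean containsNullKey() {
        for (Entry<K, V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ToyHashMap<String, Integer> map = new ToyHashMap<String, Integer>();
        map.put("a", 1);
        map.put(null, null);
        System.out.println(map.containsKey("a"));  // true
        System.out.println(map.containsKey(null)); // true
        System.out.println(map.containsKey("b"));  // false
    }
}
```

The trade-off is the one described above: the lookup loop is duplicated rather than shared through a single getEntry-style helper.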

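Answer 1: (score: 3)

I haven't tried to reproduce your results, but my first guess is that get simply returns the value that was found (if any), whereas containsKey (which is what you tested, not contains) has to test whether the key exists (with a null or non-null value) and then return a boolean. That is just a little more overhead.

Answer 2: (score: 3)

Let's look at the source code: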

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}


public boolean containsKey(Object key) {
    return getEntry(key) != null;
}

final Entry<K,V> getEntry(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

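Perhaps it is because of the extra method call, or because of the repeated key != null check?

Answer 3: (score: 3)

Testing hash maps of different sizes, the difference in performance, if there is one, is very small.

Run on Java 7 update 1 with a 4.6 GHz i7 2600K.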

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class HashMapPerfMain {
    public static void main(String... args) {
        Integer[] keys = generateKeys(2 * 1000 * 1000);

        Map<Integer, Boolean> map = new HashMap<Integer, Boolean>();
        for (int j = 0; j < keys.length; j += 2)
            map.put(keys[j], true);

        for (int t = 0; t < 5; t++) {
            long start = System.nanoTime();
            int count = countContainsKey(map, keys);
            long time = System.nanoTime() - start;
            assert count == keys.length / 2;

            long start2 = System.nanoTime();
            int count2 = countGetIsNull(map, keys);
            long time2 = System.nanoTime() - start2;
            assert count2 == keys.length / 2;
            System.out.printf("Map.containsKey avg %.1f ns, ", (double) time / keys.length);
            System.out.printf("Map.get() == null avg %.1f ns, ", (double) time2 / keys.length);
            System.out.printf("Ratio was %.2f%n", (double) time2/ time);
        }
    }

    private static int countContainsKey(Map<Integer, Boolean> map, Integer[] keys) {
        int count = 0;
        for (Integer i : keys) {
            if (map.containsKey(i)) count++;
        }
        return count;
    }

    private static int countGetIsNull(Map<Integer, Boolean> map, Integer[] keys) {
        int count = 0;
        for (Integer i : keys) {
            if (map.get(i) == null) count++;
        }
        return count;
    }

    private static Integer[] generateKeys(int size) {
        Integer[] keys = new Integer[size];
        Random random = new Random();
        for (int i = 0; i < keys.length; i++)
            keys[i] = random.nextInt();
        return keys;
    }
}

prints, for 500,000 keys

Map.containsKey avg 27.1 ns, Map.get() == null avg 26.4 ns, Ratio was 0.97
Map.containsKey avg 19.6 ns, Map.get() == null avg 19.6 ns, Ratio was 1.00
Map.containsKey avg 18.3 ns, Map.get() == null avg 19.0 ns, Ratio was 1.04
Map.containsKey avg 18.2 ns, Map.get() == null avg 19.1 ns, Ratio was 1.05
Map.containsKey avg 18.3 ns, Map.get() == null avg 19.0 ns, Ratio was 1.04

prints, for one million keys

Map.containsKey avg 30.9 ns, Map.get() == null avg 30.9 ns, Ratio was 1.00
Map.containsKey avg 26.0 ns, Map.get() == null avg 25.5 ns, Ratio was 0.98
Map.containsKey avg 25.0 ns, Map.get() == null avg 24.9 ns, Ratio was 1.00
Map.containsKey avg 25.0 ns, Map.get() == null avg 24.9 ns, Ratio was 1.00
Map.containsKey avg 24.8 ns, Map.get() == null avg 25.0 ns, Ratio was 1.01

however, for two million keys

Map.containsKey avg 36.5 ns, Map.get() == null avg 36.7 ns, Ratio was 1.00
Map.containsKey avg 34.3 ns, Map.get() == null avg 35.1 ns, Ratio was 1.02
Map.containsKey avg 36.7 ns, Map.get() == null avg 35.1 ns, Ratio was 0.96
Map.containsKey avg 36.3 ns, Map.get() == null avg 35.1 ns, Ratio was 0.97
Map.containsKey avg 36.7 ns, Map.get() == null avg 35.2 ns, Ratio was 0.96

and for five million keys

Map.containsKey avg 40.1 ns, Map.get() == null avg 40.9 ns, Ratio was 1.02
Map.containsKey avg 38.6 ns, Map.get() == null avg 40.4 ns, Ratio was 1.04
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.3 ns, Ratio was 0.97
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.3 ns, Ratio was 0.98
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.8 ns, Ratio was 0.99

BTW: The time complexity of get() and containsKey is O(1) (on an idealised machine), but as you can see, on a real machine the cost increases with the size of the Map.
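One way to look at how lookup cost changes with map size is to time containsKey over maps of several sizes in a single run. The following is a rough sketch under stated assumptions, not a rigorous benchmark: the class name LookupCostBySize, the chosen sizes, the fixed seed, and the "fastest of several rounds" measure are all choices made for this example, and the numbers will vary by machine and JVM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class LookupCostBySize {
    // Counts how many of the given keys are present in the map.
    static int countHits(Map<Integer, Boolean> map, Integer[] keys) {
        int hits = 0;
        for (Integer key : keys) {
            if (map.containsKey(key)) hits++;
        }
        return hits;
    }

    // Fills a map with `size` random keys, looks all of them up `rounds` times,
    // and returns the per-lookup time (ns) of the fastest round.
    static double averageNanos(int size, int rounds) {
        Random random = new Random(42); // fixed seed so every run uses the same keys
        Integer[] keys = new Integer[size];
        Map<Integer, Boolean> map = new HashMap<Integer, Boolean>();
        for (int i = 0; i < size; i++) {
            keys[i] = random.nextInt();
            map.put(keys[i], Boolean.TRUE);
        }
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) { // early rounds double as JIT warm-up
            long start = System.nanoTime();
            int hits = countHits(map, keys);
            long elapsed = System.nanoTime() - start;
            if (hits != size) throw new IllegalStateException("unexpected miss");
            best = Math.min(best, elapsed);
        }
        return (double) best / size;
    }

    public static void main(String[] args) {
        int[] sizes = {10 * 1000, 100 * 1000, 1000 * 1000};
        for (int size : sizes) {
            System.out.printf("%d keys: containsKey avg %.1f ns%n", size, averageNanos(size, 5));
        }
    }
}
```

Every key looked up is present in the map, so this measures hits only; a map that no longer fits in the CPU caches is one plausible reason for the per-lookup time to grow with size.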