Question

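In one use case I have a HashMap containing a single entry whose key is A. I need to call get() many times with a key B. B equals() A, but A and B are not the same object. The key contains a long array, so its equals() is expensive. I am trying to improve the performance of this map lookup. I know there are proper ways to solve the performance problem, but I am looking for the most convenient hack.

The following is from HashMap.java: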

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

If I change the if block inside the for loop to:

        if (e.hash == hash) {
            if (e.key == key) {
                return e.value; 
            } else if (e.key.equals(key)) {
                e.key = (K) key;
                return e.value;
            }
        }

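I think this would help performance a lot. The first time I call get() with key B, B's equals() is called. On every later call, B is == to the key stored in the map, which saves the equals() call.

However, because HashMap's internal fields (such as table) are package-private and Entry.key is final, it is not possible to simply extend HashMap and override get().

Questions:

Would this scheme work?
Copying HashMap.java and its related code just to change one method is not very appealing. What is the best way to implement this hack?

Thanks!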

Answer 1

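This is a terrible idea. You are changing an entry's key out from under it.

The solution is to create your own internal "identity hash" value, which you compute and guarantee to be unique for each distinct value. Then use it as a proxy for the expensive comparison in the equals() method.

For example (pseudo-Java):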

class ExpensiveEquals
{
    private class InexpensiveEqualsIdentity
    {
        ...
        public InexpensiveEqualsIdentity(ExpensiveEquals obj) { ... }
        // hashCode() must be overridden consistently with equals()
        public boolean equals(Object o) { /* an inexpensive comparison */ }
    }
    private InexpensiveEqualsIdentity identity;
    public ExpensiveEquals(...)
    {
        ... fill in the object
        this.identity = new InexpensiveEqualsIdentity(this);
    }
    public int hashCode() { return this.identity.hashCode(); }
    public boolean equals(Object o)
    {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        return this.identity.equals(((ExpensiveEquals) o).identity);
    }
}

Answer 2

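Yes, this should work, provided equals is implemented correctly (that is, symmetrically).

Try hacking the equals method in the map key's class instead: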

public boolean equals(Object obj) {
    if (this == obj) return true;
    if (obj == null) return false;
    if (!(obj instanceof MyClass)) return false;
    MyClass other = (MyClass) obj;
    if (this.longArray == other.longArray) return true;
    if (Arrays.equals(this.longArray, other.longArray)) {
        this.longArray = other.longArray;
        return true;
    }
    return false;
}

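Since your class is immutable, this trick should be safe. You will have to make the longArray field non-final, but I guarantee that this will not affect performance.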
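To see the effect of the array-sharing trick, here is a minimal sketch, assuming a single-threaded caller. The class name SharedArrayKey, the cached hash field, and the sharesArrayWith helper are hypothetical and exist only for illustration; they are not part of the asker's code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class; names are illustrative only.
final class SharedArrayKey {
    private long[] longArray;   // non-final so equals() can swap the reference
    private final int hash;     // cached, since the contents never change

    SharedArrayKey(long[] values) {
        this.longArray = values.clone();
        this.hash = Arrays.hashCode(this.longArray);
    }

    @Override
    public int hashCode() { return hash; }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof SharedArrayKey)) return false;
        SharedArrayKey other = (SharedArrayKey) obj;
        if (this.longArray == other.longArray) return true;  // cheap path
        if (Arrays.equals(this.longArray, other.longArray)) {
            this.longArray = other.longArray;  // share storage for later calls
            return true;
        }
        return false;
    }

    // Exposed only so the demo can observe whether the arrays are shared.
    boolean sharesArrayWith(SharedArrayKey other) {
        return this.longArray == other.longArray;
    }
}

class SharedArrayKeyDemo {
    public static void main(String[] args) {
        SharedArrayKey a = new SharedArrayKey(new long[] {1L, 2L, 3L});
        SharedArrayKey b = new SharedArrayKey(new long[] {1L, 2L, 3L});
        Map<SharedArrayKey, String> map = new HashMap<>();
        map.put(a, "value");
        System.out.println(a.sharesArrayWith(b)); // false: two separate arrays
        System.out.println(map.get(b));           // value: full comparison runs once
        System.out.println(a.sharesArrayWith(b)); // true: later equals() calls are cheap
    }
}
```

After the first successful comparison both keys point at the same array, so every later equals() call between them returns at the reference check.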

Answer 3

If B is the key you are really interested in, you can perform the swap from the outside.

V val = map.remove(b);
map.put(b, val);

From then on, reference equality is enough for B, and you are not hacking on the map's internals.
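The external swap can be checked with a small sketch like the one below. CountingKey and KeySwapDemo are hypothetical names introduced for the demonstration, and the containsKey guard is an addition (without it, a key that is not in the map would be inserted with a null value).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key that counts how often the expensive comparison runs.
final class CountingKey {
    static int deepComparisons = 0;
    private final long[] data;

    CountingKey(long... data) { this.data = data.clone(); }

    @Override
    public int hashCode() { return Arrays.hashCode(data); }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CountingKey)) return false;
        deepComparisons++;  // reached only when the references differ
        return Arrays.equals(data, ((CountingKey) o).data);
    }
}

class KeySwapDemo {
    // Replaces the stored key object with newKey, keeping the mapped value.
    static <K, V> void swapKey(Map<K, V> map, K newKey) {
        if (map.containsKey(newKey)) {  // guard added for this sketch
            V val = map.remove(newKey);
            map.put(newKey, val);
        }
    }

    public static void main(String[] args) {
        Map<CountingKey, String> map = new HashMap<>();
        CountingKey a = new CountingKey(1L, 2L, 3L);
        CountingKey b = new CountingKey(1L, 2L, 3L);
        map.put(a, "value");

        swapKey(map, b);                  // pays for the deep comparison here
        CountingKey.deepComparisons = 0;  // count only the lookups that follow
        for (int i = 0; i < 1000; i++) {
            map.get(b);                   // hits the == check inside HashMap
        }
        System.out.println(CountingKey.deepComparisons); // 0
    }
}
```

The cost of the deep comparison is paid during the swap itself; lookups with B afterwards are resolved by the reference check that HashMap.get() performs before calling equals().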

Answer 4

My simple, lazy idea builds on @JimGarrison's answer:

private long hash0, hash1;

void initHash() {
    // Compute a hash using md5 and store it in hash0 and hash1
    // The collision probability for two objects is 2**-128, i.e., very small,
    //   and grows with the square of the number of objects.
    // Use SHA-1 if you're scared.
}

void assureHash() {
    if (hash0 == 0 && hash1 == 0) initHash();
}


public int hashCode() {
    // If both hashes are zero, assume it wasn't computed yet.
    assureHash();
    return (int) hash0;
}

public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof ExpensiveEquals)) return false;
    ExpensiveEquals that = (ExpensiveEquals) o;
    this.assureHash();
    that.assureHash();
    return this.hash0 == that.hash0 && this.hash1 == that.hash1;
}

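This ensures that all equals calls but the first are very cheap. Even with thousands of objects, and taking the birthday paradox into account, the chance that two randomly chosen pairs of longs are equal is negligible. With a cryptographic hash function, the numbers are as good as random.

If md5 and two longs are not good enough, use SHA-1 and one additional int (this is what git does).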
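One possible way to fill in initHash() is sketched below with the standard MessageDigest API. The class name DigestKey and the data field are hypothetical, and the sketch uses an explicit boolean flag in place of the "both hashes are zero" convention, so treat it as an illustration of the idea and not as the answerer's actual code.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical key class that compares 128-bit MD5 digests, not the arrays.
final class DigestKey {
    private final long[] data;
    private long hash0, hash1;  // the two halves of the MD5 digest
    private boolean hashed;     // explicit flag (an assumption of this sketch)

    DigestKey(long... data) { this.data = data.clone(); }

    private void assureHash() {
        if (hashed) return;
        // Serialize the long array into bytes for the digest.
        ByteBuffer bytes = ByteBuffer.allocate(data.length * Long.BYTES);
        for (long v : data) bytes.putLong(v);
        try {
            byte[] md5 = MessageDigest.getInstance("MD5").digest(bytes.array());
            ByteBuffer out = ByteBuffer.wrap(md5);  // 16 bytes = two longs
            hash0 = out.getLong();
            hash1 = out.getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        hashed = true;
    }

    @Override
    public int hashCode() {
        assureHash();
        return (int) hash0;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DigestKey)) return false;
        DigestKey that = (DigestKey) o;
        this.assureHash();
        that.assureHash();
        // Equal digests are treated as equal keys; collisions are assumed negligible.
        return this.hash0 == that.hash0 && this.hash1 == that.hash1;
    }
}
```

Each object computes its digest at most once, so after the first call equals() costs two long comparisons regardless of the array length.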

Java HashMap key swap during get()
