Question

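I usually let Eclipse generate the hashCode() method for me, but I have now found signs that the generated hash codes may not be very good.

Using the hash codes returned by an Eclipse-generated hashCode() method in a HashSet makes lookups about 6 times slower than using a hand-coded hashCode() method.

Here is my test:

A class with three int fields.
hashCode() and equals() generated by Eclipse.
Fill a HashSet with 1,000,000 instances of the class.
Look up every instance in the HashSet.
Repeat the lookups 10 times.

Code: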

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MyPojo {

    private final int a;
    private final int b;
    private final int c;

    public MyPojo(int a, int b, int c) {
        super();
        this.a = a;
        this.b = b;
        this.c = c;
    }

    public static void main(String[] args) {

        List<MyPojo> listOfPojos = new ArrayList<MyPojo>();
        Set<MyPojo> setOfPojos = new HashSet<MyPojo>();

        for (int countA = 0; countA < 100; countA++) {
            for (int countB = 0; countB < 100; countB++) {
                for (int countC = 0; countC < 100; countC++) {
                    MyPojo myPojo = new MyPojo(countA, countB, countC);
                    listOfPojos.add(myPojo);
                    setOfPojos.add(myPojo);
                }
            }
        }

        long startTime = System.currentTimeMillis();

        for (int count = 0; count < 10; count++) {
            for (MyPojo myPojo : listOfPojos) {
                if (!setOfPojos.contains(myPojo)) {
                    throw new RuntimeException();
                }
            }
        }

        long endTime = System.currentTimeMillis();
        System.out.format("Execution time: %3f s", (endTime - startTime) / 1000.0);

    }

    // Generated by Eclipse
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + a;
        result = prime * result + b;
        result = prime * result + c;
        return result;
    }

    // Generated by Eclipse
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        MyPojo other = (MyPojo) obj;
        if (a != other.a)
            return false;
        if (b != other.b)
            return false;
        if (c != other.c)
            return false;
        return true;
    }

}

On my machine this results in an execution time of about 1.23 seconds.

Now replace the hashCode() method with this one, which I found elsewhere:

@Override
public int hashCode() {
    final int magic = 0x9e3779b9;
    int seed = 0;
    seed ^= this.a + magic + (seed << 6) + (seed >> 2);
    seed ^= this.b + magic + (seed << 6) + (seed >> 2);
    seed ^= this.c + magic + (seed << 6) + (seed >> 2);
    return seed;
}

Now the execution time is only 0.2 seconds, about 6 times faster!

Why?

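Edit:

As suggested, I counted how many times each hash value occurs. The results are as follows:

Using the Eclipse-generated hashCode() method:

Using the hand-coded hashCode() method:

So the Eclipse-generated method yields only 62 hash codes that occur exactly once.

The hand-coded version yields 79093 hash codes that occur exactly once and 180316 hash codes that occur exactly twice.

Quite a difference.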
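The occurrence counts described above can be reproduced with a sketch along the following lines. It is not the code used for the original measurement; the class name HashOccurrences and the TripleHash interface are made up for illustration, and the two hash functions simply repeat the arithmetic of the two hashCode() methods shown in the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class HashOccurrences {

    /** Hypothetical helper type: a hash function over the three int fields. */
    interface TripleHash {
        int hash(int a, int b, int c);
    }

    /** Same arithmetic as the Eclipse-generated hashCode() in the question. */
    static int eclipseHash(int a, int b, int c) {
        int result = 1;
        result = 31 * result + a;
        result = 31 * result + b;
        result = 31 * result + c;
        return result;
    }

    /** Same arithmetic as the hand-coded hashCode() in the question. */
    static int handCodedHash(int a, int b, int c) {
        final int magic = 0x9e3779b9;
        int seed = 0;
        seed ^= a + magic + (seed << 6) + (seed >> 2);
        seed ^= b + magic + (seed << 6) + (seed >> 2);
        seed ^= c + magic + (seed << 6) + (seed >> 2);
        return seed;
    }

    /**
     * Maps "number of occurrences" to "how many distinct hash codes
     * occur exactly that often" for all triples with fields in 0..99.
     */
    static Map<Integer, Integer> occurrenceHistogram(TripleHash f) {
        // First pass: count how often each hash code is produced.
        Map<Integer, Integer> perHash = new HashMap<>();
        for (int a = 0; a < 100; a++) {
            for (int b = 0; b < 100; b++) {
                for (int c = 0; c < 100; c++) {
                    perHash.merge(f.hash(a, b, c), 1, Integer::sum);
                }
            }
        }
        // Second pass: group the hash codes by their occurrence count.
        Map<Integer, Integer> histogram = new TreeMap<>();
        for (int occurrences : perHash.values()) {
            histogram.merge(occurrences, 1, Integer::sum);
        }
        return histogram;
    }

    public static void main(String[] args) {
        System.out.println("Eclipse:    " + occurrenceHistogram(HashOccurrences::eclipseHash));
        System.out.println("Hand-coded: " + occurrenceHistogram(HashOccurrences::handCodedHash));
    }
}
```

For the Eclipse variant, the entry for key 1 is the number of hash codes that occur exactly once. With fields limited to 0..99 the polynomial 961a + 31b + c (plus a constant) can only take values in a window of 98308 consecutive integers, so one million objects necessarily share hash codes heavily.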

Edit 2:

I also tried Objects.hash(...), which gives the same occurrence counts as the Eclipse-generated hashCode() method.

@Override
public int hashCode() {
    return Objects.hash(a, b, c);
}

On top of that, it actually slowed execution down: 1.38 seconds.
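The identical counts are expected: Objects.hash(...) is documented to delegate to Arrays.hashCode(Object[]), which uses the same 31-based polynomial as the Eclipse template. The sketch below checks this on the test data; the class name ObjectsHashCheck is made up for illustration. The extra run time plausibly comes from boxing the three ints and allocating a varargs array on every call, but that is an assumption and was not measured here.

```java
import java.util.Objects;

public class ObjectsHashCheck {

    /** Same arithmetic as the Eclipse-generated hashCode() in the question. */
    static int eclipseHash(int a, int b, int c) {
        int result = 1;
        result = 31 * result + a;
        result = 31 * result + b;
        result = 31 * result + c;
        return result;
    }

    /** Returns true if Objects.hash agrees with eclipseHash on every triple with fields in 0..99. */
    static boolean sameOnTestData() {
        for (int a = 0; a < 100; a++) {
            for (int b = 0; b < 100; b++) {
                for (int c = 0; c < 100; c++) {
                    // Objects.hash boxes each int and wraps them in an Object[].
                    if (Objects.hash(a, b, c) != eclipseHash(a, b, c)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameOnTestData()); // prints true
    }
}
```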

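Edit 3:

Here is an explanation of where the "magic number" in the better hash code method above comes from:

Magic number in boost::hash_combine

Edit 4: Used http://projectlombok.org to generate the hashCode() and equals() methods

Lombok gave the best results: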

1: 33958
2: 146124
3: 8118
4: 162360

Execution time: 0.187000 s

Answer 1

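The Eclipse hashCode() follows the guidelines suggested in Effective Java. The author says the approach is fairly good, but it is definitely not the best one.

If hashCode does not perform well enough for your data, you are free to choose an alternative.

One more thing I would like to mention: for almost every hashCode function you can probably find some data set that prevents the function from distributing hash values evenly, which makes the HashSet in your code behave like one very long list.

You can see further discussion here: Is the hashCode function generated by Eclipse any good?

You may also want to read this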
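As a small illustration of such a data set, the sketch below feeds the 31-based formula triples of the form (0, k, -31 * k). Increasing the second field by one adds 31 to the hash and decreasing the third field by 31 takes it away again, so every triple ends up with the same hash code. The class name CollidingDataset and the choice of triples are made up for this example; Objects.hash is used because it computes the same polynomial as the Eclipse template.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class CollidingDataset {

    /** Distinct hash codes produced for the triples (0, k, -31 * k), k = 0..n-1. */
    static Set<Integer> distinctHashes(int n) {
        Set<Integer> hashes = new HashSet<>();
        for (int k = 0; k < n; k++) {
            // 31 * (31 * (31 * 1 + 0) + k) + (-31 * k) == 31 * 31 * 31 for every k
            hashes.add(Objects.hash(0, k, -31 * k));
        }
        return hashes;
    }

    public static void main(String[] args) {
        // 1000 different objects, but only one distinct hash code.
        System.out.println(distinctHashes(1000).size()); // prints 1
    }
}
```

A HashSet filled with such objects has to resolve every lookup inside a single bucket, which is the degenerate case the answer describes.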
