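Recently, I came across a piece of code that used a Map<Integer, String>, where the Integer key is the hashCode of some string and the String value is the string that hash was computed from.
Is this the right thing to do? With this design, equals() is never called on the String when calling get(), because the lookup uses only the Integer produced by calling hashCode() on the String object.
Or are hashCode() values unique for unique strings?
I checked the equals() method of the String class, and there is comparison logic written there. I am confused.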
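Answer 0 (score: 9)
HashMap uses equals() to compare keys. It uses hashCode() only to find the bucket where the key is located, which drastically reduces the number of keys that have to be compared with equals().
Obviously, hashCode() cannot produce unique values: an int is limited to 2^32 distinct values, while there are infinitely many possible String values.
In conclusion, the result of hashCode() is not suitable as the key of a Map.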
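As a minimal sketch of the point above (the class name BucketLookupDemo and the stored values are made up for illustration), the following shows two strings with the same hashCode() coexisting as keys in a HashMap<String, String>. They land in the same bucket, and equals() is what keeps them apart:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: colliding String keys still work in a HashMap because equals() decides. */
class BucketLookupDemo {

    /** Builds a map keyed by the strings themselves (hypothetical sample data). */
    static Map<String, String> byString() {
        Map<String, String> map = new HashMap<>();
        map.put("FB", "value for FB");
        map.put("Ea", "value for Ea"); // same hashCode() as "FB", but a different key
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = byString();
        System.out.println("FB".hashCode() == "Ea".hashCode()); // true
        System.out.println(map.size());    // 2: both entries are kept
        System.out.println(map.get("FB")); // value for FB
        System.out.println(map.get("Ea")); // value for Ea
    }
}
```

Because the map stores the real keys, a hash collision only costs an extra equals() call; it never mixes up the entries.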
Answer 1 (score: 4)
No, they are not unique. For example, "FB" and "Ea" both have hashCode = 2236.
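To see where 2236 comes from, here is a small sketch that recomputes the hash by hand using the formula documented for String.hashCode(), s[0]*31^(n-1) + ... + s[n-1]. The helper name manualHash is an assumption made for this example and is not part of the JDK:

```java
/** Sketch: recompute String.hashCode() by hand to show why "FB" and "Ea" collide. */
class StringHashDemo {

    /** Hypothetical helper: applies h = 31 * h + c for each char, as String.hashCode() is documented to do. */
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow wraps, same as in String.hashCode()
        }
        return h;
    }

    public static void main(String[] args) {
        // "FB": 'F' = 70, 'B' = 66  ->  70 * 31 + 66 = 2236
        // "Ea": 'E' = 69, 'a' = 97  ->  69 * 31 + 97 = 2236
        System.out.println(manualHash("FB")); // 2236
        System.out.println(manualHash("Ea")); // 2236
        System.out.println("FB".hashCode());  // 2236
        System.out.println("FB".equals("Ea")); // false
    }
}
```

Lowering the first character by one and raising the second by 31 leaves the sum unchanged, which is why such pairs are easy to find.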
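Answer 2 (score: 1)
This is incorrect, because hashCode() values can collide. hashCode() makes no guarantee of uniqueness, so in principle any pair of distinct strings may collide, and you cannot rely on it as a key. If you need a practically unique identifier for a string, you can use a cryptographic hash as the key instead.
The following code shows how to do this: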
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
/**
* Immutable class that represents a unique key for a string. This unique key
* can be used as a key in a hash map, without the likelihood of a collision:
* generating the same key for a different String. Note that you should first
* check if keeping a key to a reference of a string is feasible, in that case a
* {@link Set} may suffice.
* <P>
* This class utilizes SHA-512 to generate the keys and uses
* {@link StandardCharsets#UTF_8} for the encoding of the strings. If a smaller
* output size than 512 bits (64 bytes) is required then the leftmost bytes of
* the SHA-512 hash are used. A smaller key is therefore a prefix of any larger
* key generated from the same String value.
* <P>
* Note that constructing collisions is not infeasible for the smaller key
* sizes, roughly 8 to 20 bytes.
*
* @author owlstead
*/
public final class UniqueKey implements Serializable {
public static final int MIN_DIGEST_SIZE_BYTES = 8;
public static final int MAX_DIGEST_SIZE_BYTES = 64;
/**
* Creates a unique key for a string with the maximum size of 64 bytes.
*
* @param input
* the input, not null
* @return the generated instance
*/
public static UniqueKey createUniqueKey(final CharSequence input) {
return doCreateUniqueKey(input, MAX_DIGEST_SIZE_BYTES);
}
/**
* Creates a unique key for a string with a size of 8 to 64 bytes.
*
* @param input
* the input, not null
* @param outputSizeBytes
* the output size
* @return the generated instance
*/
public static UniqueKey createUniqueKey(final CharSequence input,
final int outputSizeBytes) {
return doCreateUniqueKey(input, outputSizeBytes);
}
@Override
public boolean equals(final Object obj) {
if (!(obj instanceof UniqueKey)) {
return false;
}
final UniqueKey that = (UniqueKey) obj;
return ByteBuffer.wrap(this.key).equals(ByteBuffer.wrap(that.key));
}
@Override
public int hashCode() {
return ByteBuffer.wrap(this.key).hashCode();
}
/**
* Outputs an - in itself - unique String representation of this key.
*
* @return the string <CODE>"{Key: [HEX ENCODED KEY]}"</CODE>
*/
@Override
public String toString() {
// non-optimal but readable conversion to hexadecimal
final StringBuilder sb = new StringBuilder(this.key.length * 2);
sb.append("{Key: ");
for (int i = 0; i < this.key.length; i++) {
sb.append(String.format("%02X", this.key[i]));
}
sb.append("}");
return sb.toString();
}
/**
* Makes it possible to retrieve the underlying key data (e.g. to use a
* different encoding).
*
* @return the data in a read only ByteBuffer
*/
public ByteBuffer asReadOnlyByteBuffer() {
return ByteBuffer.wrap(this.key).asReadOnlyBuffer();
}
private static final long serialVersionUID = 1L;
private static final int BUFFER_SIZE = 512;
// byte array instead of ByteBuffer to support serialization
private final byte[] key;
private static UniqueKey doCreateUniqueKey(final CharSequence input,
final int outputSizeBytes) {
// --- setup digest
final MessageDigest digestAlgorithm;
try {
// note: relatively fast on 64 bit systems (faster than SHA-256!)
digestAlgorithm = MessageDigest.getInstance("SHA-512");
} catch (final NoSuchAlgorithmException e) {
throw new IllegalStateException(
"SHA-512 is not available in this Java runtime", e);
}
// --- validate input parameters
if (outputSizeBytes < MIN_DIGEST_SIZE_BYTES
|| outputSizeBytes > MAX_DIGEST_SIZE_BYTES) {
throw new IllegalArgumentException(
"Unique key size either too small or too big");
}
// --- setup loop
final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
final CharBuffer buffer = CharBuffer.wrap(input);
final ByteBuffer encodedBuffer = ByteBuffer.allocate(BUFFER_SIZE);
CoderResult coderResult;
// --- loop over all characters
// (instead of encoding everything to byte[] at once - peak memory!)
while (buffer.hasRemaining()) {
coderResult = encoder.encode(buffer, encodedBuffer, false);
if (coderResult.isError()) {
throw new IllegalArgumentException(
"Invalid code point in input string");
}
encodedBuffer.flip();
digestAlgorithm.update(encodedBuffer);
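encodedBuffer.clear();
if (coderResult.isUnderflow()) {
// the encoder cannot consume more without end-of-input (e.g. a dangling
// high surrogate); leave the rest to the final call to avoid looping forever
break;
}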
}
coderResult = encoder.encode(buffer, encodedBuffer, true);
if (coderResult.isError()) {
throw new IllegalArgumentException(
"Invalid code point in input string");
}
encodedBuffer.flip();
digestAlgorithm.update(encodedBuffer);
// no need to clear encodedBuffer if generated locally
// --- resize result if required
final byte[] digest = digestAlgorithm.digest();
final byte[] result;
if (outputSizeBytes == digest.length) {
result = digest;
} else {
result = Arrays.copyOf(digest, outputSizeBytes);
}
// --- and return the final, possibly resized, result
return new UniqueKey(result);
}
private UniqueKey(final byte[] key) {
this.key = key;
}
}
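Answer 3 (score: 0)
By doing its own hashing of the key String, that code runs the risk that two different key strings generate the same Integer map key, in which case the code will fail in some situations.
In general, the code should probably use Map<String, String>.
However, it is possible that the author did this for a good (that is, deliberate) reason and that it is not a mistake. Without seeing the code, we cannot tell.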
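The failure mode can be sketched as follows. This is an illustration under assumptions, not the code from the question: the class name HashKeyedMapDemo, the helper byHash and the stored values are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: a map keyed by String.hashCode() silently loses entries on a collision. */
class HashKeyedMapDemo {

    /** Hypothetical helper: stores one value per string, keyed by the string's hashCode(). */
    static Map<Integer, String> byHash(String... keys) {
        Map<Integer, String> map = new HashMap<>();
        for (String key : keys) {
            map.put(key.hashCode(), "value for " + key);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = byHash("FB", "Ea");
        System.out.println(map.size());                // 1: the second put replaced the first
        System.out.println(map.get("FB".hashCode()));  // value for Ea
    }
}
```

Looking up "FB" returns the value stored for "Ea", and nothing signals the problem. A Map<String, String> built from the same two keys would keep both entries.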
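Answer 4 (score: 0)
hashCode is used to improve the performance of collections. Take a Set, which contains unique items: when a new item is added, its hashCode is used to detect whether the item is already in the set.
In case of a collision, we fall back to the (typically) slower equals comparison.
So for performance it is important that hashCode returns distinct values for distinct objects. More importantly, the hashCode of an object must be consistent between calls, given the same internal state.
So it is not wise to use hashCodes as keys in a map! They are simply not unique.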
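The fallback to equals can be made visible with a small sketch. Everything here is hypothetical and exists only for the demonstration: the Item class, its constantHash switch and the equalsCalls counter. The exact number of equals calls depends on the HashSet implementation, so the example only prints the counts and does not promise specific values.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch: colliding hash codes keep a HashSet correct but force extra equals() calls. */
class CollisionCostDemo {

    /** Counts how often Item.equals() runs (demo only, not thread-safe). */
    static int equalsCalls = 0;

    /** Hypothetical item whose hashCode() is either well spread or deliberately constant. */
    static final class Item {
        final int id;
        final boolean constantHash;

        Item(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            if (!(o instanceof Item)) {
                return false;
            }
            Item other = (Item) o;
            return other.id == id && other.constantHash == constantHash;
        }

        @Override
        public int hashCode() {
            // Equal items always share id and constantHash, so this stays
            // consistent with equals() in both modes.
            return constantHash ? 1 : id;
        }
    }

    /** Adds n distinct items to a HashSet and returns how many equals() calls that took. */
    static int countEqualsCalls(int n, boolean constantHash) {
        equalsCalls = 0;
        Set<Item> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new Item(i, constantHash));
        }
        if (set.size() != n) {
            throw new IllegalStateException("set lost an item");
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("distinct hashes: " + countEqualsCalls(10, false) + " equals() calls");
        System.out.println("constant hash:   " + countEqualsCalls(10, true) + " equals() calls");
    }
}
```

In both modes the set ends up with all ten items, so collisions cost time rather than correctness, as long as the real objects (and not their hash codes) are what the collection stores.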