HashMap serialization round trip does not preserve order

Time: 2014-03-13 22:22:43

Tags: java serialization

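I noticed that with the latest version of Java (1.7.0_u51), serializing and deserializing a HashMap no longer preserves the order of the elements in the map. See the following example: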

@Test
public void test() throws IOException, ClassNotFoundException {
    HashMap<String, String> map1 = new HashMap<>();
    map1.put("a1234567", "aaa");
    map1.put("b1234567", "bbb");

    System.out.println("Map1: " + map1.toString());

    byte[] serializedMap1 = objectToBytes(map1);

    System.out.println("Map1 Serialized: " + Arrays.toString(serializedMap1));

    Object map2 = bytesToObject(serializedMap1);

    System.out.println("Map2: " + map2.toString());

    byte[] serializedMap2 = objectToBytes((Serializable) map2);

    System.out.println("Map2 Serialized: " + Arrays.toString(serializedMap2));

    Object map3 = bytesToObject(serializedMap2);

    System.out.println("Map3: " + map3.toString());

    byte[] serializedMap3 = objectToBytes((Serializable) map3);

    System.out.println("Map3 Serialized: " + Arrays.toString(serializedMap3));

    Object map4 = bytesToObject(serializedMap3);

    System.out.println("Map4: " + map4.toString());

    byte[] serializedMap4 = objectToBytes((Serializable) map4);

    System.out.println("Map4 Serialized: " + Arrays.toString(serializedMap4));
}

private byte[] objectToBytes(Serializable obj) throws IOException {
    PoolByteArrayOutputStream bos = new PoolByteArrayOutputStream();
    try {
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(obj);
        byte[] bytes = bos.toByteArray();
        oos.close();
        return bytes;
    } finally {
        bos.close();
    }
}

private Object bytesToObject(byte[] str) throws IOException, ClassNotFoundException {
    ByteArrayInputStream bis = new ByteArrayInputStream(str);
    ObjectInputStream ois = new ClassLoaderObjectInputStream(bis, null);

    Object obj = ois.readObject();
    ois.close();
    bis.close();
    return obj;
}

The test above prints:

Map1: {a1234567=aaa, b1234567=bbb}
Map1 Serialized: [-84, -19, 0, 5, 115, 114, 0, 17, 106, 97, 118, 97, 46, 117, 116, 105, 108, 46, 72, 97, 115, 104, 77, 97, 112, 5, 7, -38, -63, -61, 22, 96, -47, 3, 0, 2, 70, 0, 10, 108, 111, 97, 100, 70, 97, 99, 116, 111, 114, 73, 0, 9, 116, 104, 114, 101, 115, 104, 111, 108, 100, 120, 112, 63, 64, 0, 0, 0, 0, 0, 12, 119, 8, 0, 0, 0, 16, 0, 0, 0, 2, 116, 0, 8, 97, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 97, 97, 97, 116, 0, 8, 98, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 98, 98, 98, 120]
Map2: {b1234567=bbb, a1234567=aaa}
Map2 Serialized: [-84, -19, 0, 5, 115, 114, 0, 17, 106, 97, 118, 97, 46, 117, 116, 105, 108, 46, 72, 97, 115, 104, 77, 97, 112, 5, 7, -38, -63, -61, 22, 96, -47, 3, 0, 2, 70, 0, 10, 108, 111, 97, 100, 70, 97, 99, 116, 111, 114, 73, 0, 9, 116, 104, 114, 101, 115, 104, 111, 108, 100, 120, 112, 63, 64, 0, 0, 0, 0, 0, 1, 119, 8, 0, 0, 0, 2, 0, 0, 0, 2, 116, 0, 8, 98, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 98, 98, 98, 116, 0, 8, 97, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 97, 97, 97, 120]
Map3: {a1234567=aaa, b1234567=bbb}
Map3 Serialized: [-84, -19, 0, 5, 115, 114, 0, 17, 106, 97, 118, 97, 46, 117, 116, 105, 108, 46, 72, 97, 115, 104, 77, 97, 112, 5, 7, -38, -63, -61, 22, 96, -47, 3, 0, 2, 70, 0, 10, 108, 111, 97, 100, 70, 97, 99, 116, 111, 114, 73, 0, 9, 116, 104, 114, 101, 115, 104, 111, 108, 100, 120, 112, 63, 64, 0, 0, 0, 0, 0, 1, 119, 8, 0, 0, 0, 2, 0, 0, 0, 2, 116, 0, 8, 97, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 97, 97, 97, 116, 0, 8, 98, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 98, 98, 98, 120]
Map4: {b1234567=bbb, a1234567=aaa}
Map4 Serialized: [-84, -19, 0, 5, 115, 114, 0, 17, 106, 97, 118, 97, 46, 117, 116, 105, 108, 46, 72, 97, 115, 104, 77, 97, 112, 5, 7, -38, -63, -61, 22, 96, -47, 3, 0, 2, 70, 0, 10, 108, 111, 97, 100, 70, 97, 99, 116, 111, 114, 73, 0, 9, 116, 104, 114, 101, 115, 104, 111, 108, 100, 120, 112, 63, 64, 0, 0, 0, 0, 0, 1, 119, 8, 0, 0, 0, 2, 0, 0, 0, 2, 116, 0, 8, 98, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 98, 98, 98, 116, 0, 8, 97, 49, 50, 51, 52, 53, 54, 55, 116, 0, 3, 97, 97, 97, 120]

(Note that this only happens with map keys whose last 7 characters are equal.)

As the output above shows, the order keeps changing after every serialization round trip.

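I understand that a map's internal order is not guaranteed to be consistent, and I do not rely on it. But I would have assumed that after a serialization round trip, when the map itself has not changed, the serialized bytes would be the same.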

What specifically changed in the JDK to cause this? (Is this a bug in the JDK?)

Is there a way to consistently get the same serialized bytes for the same HashMap? (Without switching to a different, order-preserving map.)

3 answers:

Answer 0 (score: 5)

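A HashMap does not have any predictable order, so it is not a problem if serialization changes the order it happens to have. Note that any change to the map (adding or removing entries) can also change its order.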

If insertion order matters, you should use LinkedHashMap.
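A minimal sketch of that suggestion, assuming plain JDK streams rather than the helper classes from the question (the class and method names here are made up for illustration). It only shows that the iteration order survives the round trip. The serialized bytes can still differ between the first and later round trips, because the map also writes its threshold and bucket count (12 and 16 versus 1 and 2 in the question's output).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

class LinkedHashMapRoundTrip {

    // Serializes the map and immediately deserializes it again.
    @SuppressWarnings("unchecked")
    static Map<String, String> roundTrip(Map<String, String> map)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(map);
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (Map<String, String>) ois.readObject();
        }
    }

    // Returns the keys in iteration order, joined for easy comparison.
    static String keyOrder(Map<String, String> map) {
        return String.join(",", map.keySet());
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> map1 = new LinkedHashMap<>();
        map1.put("b1234567", "bbb"); // inserted first, so iterated first
        map1.put("a1234567", "aaa");

        Map<String, String> map2 = roundTrip(map1);
        Map<String, String> map3 = roundTrip(map2);

        System.out.println("Map1: " + keyOrder(map1));
        System.out.println("Map2: " + keyOrder(map2));
        System.out.println("Map3: " + keyOrder(map3));
    }
}
```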

Answer 1 (score: 3)

HashMaps are explicitly documented as unordered. If you rely on their ordering, you are already doing it wrong.

Answer 2 (score: 2)

  

"I want to be able to get consistent serialized data."

If you need that, you will have to use a different data structure. The HashMap class does not provide those guarantees.

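In any simple hash table, the observed order of the entries depends on:

  • the size of the table,

  • the order in which elements were added and removed, and

  • the actual values returned by the hashCode() function.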
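To illustrate the first point, here is a small sketch (class and method names are made up) that builds two maps with equal contents but different initial capacities. On the OpenJDK 8 and later implementation the two maps iterate their keys in different orders. Nothing in the HashMap contract promises either order, so treat the printed result as an observation, not a guarantee.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HashMapCapacityOrder {

    // Builds a map with the given initial capacity and the same two keys.
    static Map<Integer, String> build(int initialCapacity) {
        Map<Integer, String> map = new HashMap<>(initialCapacity);
        map.put(1, "one");
        map.put(16, "sixteen");
        return map;
    }

    // Returns the keys in whatever order the map happens to iterate them.
    static List<Integer> keyOrder(Map<Integer, String> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<Integer, String> small = build(16);
        Map<Integer, String> large = build(32);

        // The maps are equal as maps, yet their iteration order may differ
        // because the bucket index depends on the table size.
        System.out.println("equal contents:    " + small.equals(large));
        System.out.println("capacity 16 order: " + keyOrder(small));
        System.out.println("capacity 32 order: " + keyOrder(large));
    }
}
```

This mirrors what the question's byte dumps show: the first map was written with 16 buckets and the deserialized ones with 2, so the table size itself changed across the round trip.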

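If you wrote a custom Map based on a hash table, you could in theory control the first two across serialization and deserialization. The last one, however, is outside your control. So if one of your keys has (for example) a hash code that depends on the identity hash code, there is no way to preserve the iteration order, no matter how you serialize and deserialize.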
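A short sketch of the identity hash code case, using a hypothetical Token key class that does not override hashCode() or equals(). After the round trip the map holds a brand new Token instance with its own identity, so the deserialized map cannot reproduce the original bucket layout, and a lookup with the original key fails.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class IdentityHashRoundTrip {

    // Hypothetical key type: inherits Object.hashCode() and Object.equals().
    static class Token implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;

        Token(String label) {
            this.label = label;
        }
    }

    // Serializes the object and immediately deserializes it again.
    static Object roundTrip(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Token key = new Token("a");
        HashMap<Token, String> original = new HashMap<>();
        original.put(key, "aaa");

        Map<?, ?> copy = (Map<?, ?>) roundTrip(original);

        System.out.println(original.containsKey(key)); // true
        System.out.println(copy.containsKey(key));     // false: the copy holds a different Token instance
    }
}
```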

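In your case, you appear to be serializing/deserializing a HashMap<String, String>. This is a case where preserving the order across Java versions is theoretically possible. (The algorithm for hashing a Java String is specified ...) However, I can't see how you could achieve it using HashMap ... short of tampering with its private internal data structures.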
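For reference, the String.hashCode() Javadoc gives the formula s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in int arithmetic. The sketch below (the class and method names are made up) recomputes it by hand for the keys from the question and compares the result with the built-in value.

```java
class StringHashSpec {

    // Recomputes String.hashCode() from its documented formula using
    // Horner's rule; int overflow wraps around, as in the JDK.
    static int specifiedHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"a1234567", "b1234567"}) {
            System.out.println(key
                    + " matches built-in hashCode: "
                    + (specifiedHash(key) == key.hashCode()));
        }
    }
}
```

Because the hash values themselves are fixed, any change in iteration order for String keys has to come from the table size or the insertion order, not from the keys.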

In short, if you need to preserve the order of elements across serialization/deserialization, use LinkedHashMap or TreeMap.
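If the goal is identical bytes rather than just a stable order, TreeMap looks like the simpler choice, since it writes its entries in key order and does not serialize a table size. A minimal sketch, assuming natural String ordering (no custom comparator) and plain JDK streams; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.TreeMap;

class TreeMapStableBytes {

    // Serializes an object to a byte array.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Deserializes an object from a byte array.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        TreeMap<String, String> first = new TreeMap<>();
        first.put("a1234567", "aaa");
        first.put("b1234567", "bbb");

        TreeMap<String, String> second = new TreeMap<>();
        second.put("b1234567", "bbb"); // reversed insertion order
        second.put("a1234567", "aaa");

        byte[] firstBytes = toBytes(first);
        byte[] secondBytes = toBytes(second);
        byte[] roundTripBytes = toBytes((Serializable) fromBytes(firstBytes));

        System.out.println("same bytes for both insertion orders: "
                + Arrays.equals(firstBytes, secondBytes));
        System.out.println("same bytes after a round trip: "
                + Arrays.equals(firstBytes, roundTripBytes));
    }
}
```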