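I ran into this problem while working on a question that was asked earlier.
This may be specific to ObjectInputStream rather than to binary reading in general, so the title may be misleading.
Basically, the problem is this: the author has serialized a hash map from strings to doubles. The author's custom serialization format is very simple. For each entry in the hash map, it writes: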
int n // length of string key as a 4-byte integer
byte[n] key // a string of length n
double value // the value associated with the key
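To make the record layout above concrete, here is a minimal sketch that encodes a single entry and reports its size. It deliberately uses a plain DataOutputStream rather than the author's ObjectOutputStream, so only the raw format is visible; the class name RecordLayout and the method encodeEntry are hypothetical names chosen for illustration, and the key is assumed to be ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class RecordLayout {
    // Encodes one (key, value) entry as: 4-byte length, key bytes, 8-byte double.
    static byte[] encodeEntry(String key, double value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            byte[] keyBytes = key.getBytes(StandardCharsets.US_ASCII); // assumption: ASCII keys
            out.writeInt(keyBytes.length); // length of the key
            out.write(keyBytes);           // the key itself
            out.writeDouble(value);        // the associated value
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] record = encodeEntry("2010-00-008.html", 10811.15198506469);
        // 4 (length) + 16 (key) + 8 (double) = 28 bytes
        System.out.println(record.length + " bytes");
    }
}
```

With 16-character keys, every entry should therefore occupy exactly 28 bytes of payload, which is the expectation the rest of the question is measured against.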
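Now, for some reason, during serialization one of the strings, 2010-00-008.html, gets written with two extra bytes.
So instead of 16 bytes, 18 bytes are written. That ought to cause problems, because the length field still says the string is 16 bytes long.
However, for some reason, you can write the hash map and read it back perfectly! It seems that, given an 18-byte string, you can read 16 bytes and still get the whole thing.
Here is the code. It is basically the code from the other question, except that I set it up so that you should be able to just change the path and run it. After running it, you will get a series of write statements followed by a series of read statements. Inspect the file and you should notice the extra bytes in the string, yet the program does not crash.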
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;
public class Test {
    // customize the path as needed
    public static String path = "C:\\temp\\sample.dat";
    HashMap<String, Double> map = new HashMap<String, Double>();

    public Test() {
        map.put("2010-00-027.html",21732.994621513037); map.put("2010-00-020.html",3466.5169348296736); map.put("2010-00-051.html",12528.648992702407); map.put("2010-00-062.html",3354.8950010256385);
        map.put("2010-00-024.html",10295.095511718278); map.put("2010-00-052.html",5381.513344679818); map.put("2010-00-007.html",16466.33813960735); map.put("2010-00-017.html",9484.969198176652);
        map.put("2010-00-054.html",15423.873112634772); map.put("2010-00-022.html",8123.842752870753); map.put("2010-00-033.html",21238.496665104063); map.put("2010-00-028.html",7578.792651786424);
        map.put("2010-00-048.html",3566.4118233046393); map.put("2010-00-040.html",2681.0799941861724); map.put("2010-00-049.html",14308.090890746222); map.put("2010-00-058.html",5911.342406606804);
        map.put("2010-00-045.html",2284.118716145881); map.put("2010-00-031.html",2859.565771680721); map.put("2010-00-046.html",4555.187022907964); map.put("2010-00-036.html",8479.709295569426);
        map.put("2010-00-061.html",846.8292195815125); map.put("2010-00-023.html",14108.644025417952); map.put("2010-00-041.html",22686.232732684934); map.put("2010-00-025.html",9513.539663409734);
        map.put("2010-00-012.html",459.6427911376829); map.put("2010-00-005.html",0.0); map.put("2010-00-013.html",2646.403220496738); map.put("2010-00-065.html",5808.86423609936);
        map.put("2010-00-056.html",12154.250518054876); map.put("2010-00-008.html",10811.15198506469); map.put("2010-00-042.html",9271.006516004005); map.put("2010-00-000.html",4387.4162586468965);
        map.put("2010-00-059.html",4456.211623469774); map.put("2010-00-055.html",3534.7511584735325); map.put("2010-00-057.html",8745.640098512009); map.put("2010-00-032.html",4993.295735075575);
        map.put("2010-00-021.html",3852.5805998017922); map.put("2010-00-043.html",4108.020033536286); map.put("2010-00-053.html",2.2446400279239946); map.put("2010-00-030.html",17853.541210836203);
    }

    public void write() {
        // try-with-resources closes (and flushes) the stream even if a write fails
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(path))) {
            oos.writeInt(map.size()); // write size of the map
            for (Map.Entry<String, Double> entry : map.entrySet()) { // iterate entries
                System.out.println("writing ("+ entry.getKey() +","+ entry.getValue() +")");
                byte[] bytes = entry.getKey().getBytes();
                oos.writeInt(bytes.length); // length of key string
                oos.write(bytes); // key string bytes
                oos.writeDouble(entry.getValue()); // value
            }
        } catch (Exception e) {
            e.printStackTrace(); // report failures instead of silently swallowing them
        }
    }

    public void read() {
        // try-with-resources closes the stream once reading is done
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(path))) {
            int size = ois.readInt(); // read size of the map
            HashMap<String, Double> newMap = new HashMap<>(size);
            for (int i = 0; i < size; i++) { // iterate entries
                int length = ois.readInt(); // length of key string
                byte[] bytes = new byte[length];
                ois.readFully(bytes, 0, length);
                //ois.read(bytes);
                String key = new String(bytes);
                double value = ois.readDouble(); // value
                newMap.put(key, value);
                System.out.println("read ("+ key +","+ value +")");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        Test t = new Test();
        t.write();
        t.read();
    }
}
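To inspect the file for the extra bytes, as suggested above the listing, a small hex dump is enough. This is only a sketch: the class name HexDump and the helper toHex are hypothetical, and the file to dump is assumed to be passed as the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HexDump {
    // Formats bytes as two-digit uppercase hex values, bytesPerLine values per line.
    static String toHex(byte[] data, int bytesPerLine) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            sb.append(String.format("%02X", data[i] & 0xFF));
            boolean endOfLine = (i + 1) % bytesPerLine == 0 || i == data.length - 1;
            sb.append(endOfLine ? '\n' : ' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HexDump <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.print(toHex(data, 16));
    }
}
```

Running it against the file written by Test (for example C:\temp\sample.dat) prints the raw contents, so the bytes around the affected key can be compared with the expected 4 + 16 + 8 layout.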
Answer 0 (score: 4)
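You need to read the Protocol chapter of the Object Serialization Specification. Besides the actual data, the stream also contains type and block markers. This is one of them, and ObjectInputStream filters it out when the stream is read correctly.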
Edit: the extra bytes are 77 64, which denote a TC_BLOCKDATA block of size 0x64 (100 bytes).
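The framing described in this answer can be reproduced in memory. The sketch below writes 40 dummy entries in the question's format (4 + 40 * 28 = 1124 payload bytes) once through a DataOutputStream and once through an ObjectOutputStream, then compares the results. The class and method names are hypothetical, the keys and values are placeholders, and the offsets in the comments assume the JDK's ObjectOutputStream buffers primitive data in 1024-byte blocks, which is an implementation detail and not something the question states.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BlockMarkerDemo {
    static final int ENTRIES = 40;

    // Writes the entry count followed by 40 (length, key, double) records.
    static void writeEntries(DataOutput out) throws IOException {
        byte[] key = "2010-00-000.html".getBytes(StandardCharsets.US_ASCII); // placeholder key
        out.writeInt(ENTRIES);
        for (int i = 0; i < ENTRIES; i++) {
            out.writeInt(key.length);
            out.write(key);
            out.writeDouble(i); // placeholder value
        }
    }

    // The payload alone, with no framing.
    static byte[] rawBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeEntries(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // The same payload as ObjectOutputStream emits it: stream header plus block markers.
    static byte[] objectStreamBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            writeEntries(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = rawBytes();
        byte[] framed = objectStreamBytes();
        System.out.println("DataOutputStream:   " + raw.length + " bytes");
        System.out.println("ObjectOutputStream: " + framed.length + " bytes");
        // Assumed layout: 4-byte stream header, 5-byte long block header,
        // 1024 data bytes, then the short block header for the remaining data.
        int markerOffset = 4 + 5 + 1024;
        if (framed.length > markerOffset + 1) {
            System.out.println(String.format("bytes at offset %d: %02X %02X",
                    markerOffset, framed[markerOffset] & 0xFF, framed[markerOffset + 1] & 0xFF));
        }
    }
}
```

Under that assumption the first 1024 payload bytes end 8 bytes into the 37th key, so the marker for the remaining 100 bytes lands in the middle of a string. That would explain why one key appears to be 18 bytes long on disk while readFully, which reads through the block framing, still returns the 16 key bytes.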