Question

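Before going on with my question: please note that I am not working on a client-server application that needs serialization. The program I am trying to customize stores one big instance of a big class in a .dat file. I have already read this question (Memory leak in ObjectOutputStream and ObjectInputStream) and the suggestions that I might need to: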

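use the ObjectOutputStream.reset() method after writing the class instance to the .dat file, so that the stream no longer holds a reference to it;
rewrite the code without using serialization;
split the file and read it in chunks;
use -Xmx;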
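For reference, the first and third options in the list above could be combined roughly as in the sketch below. It assumes the data can be broken into several independent objects; the class name ResetSketch, the method names, and the int[] chunks are hypothetical and not part of the tagger code. It is only meant to show where reset() fits when more than one object is written to the same stream.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.List;

// Hypothetical helper for illustration only.
class ResetSketch {

  // Writes a chunk count, then each chunk as its own top-level object.
  // reset() after every write makes the stream forget the objects it has
  // already written, so its back-reference table does not keep growing.
  static void writeChunks(File file, List<int[]> chunks) throws IOException {
    try (ObjectOutputStream out = new ObjectOutputStream(
        new BufferedOutputStream(new FileOutputStream(file)))) {
      out.writeInt(chunks.size());
      for (int[] chunk : chunks) {
        out.writeObject(chunk);
        out.reset();
      }
    }
  }

  // Reads the chunks back one at a time and returns the total element count.
  // Only one chunk is referenced by this method at any moment.
  static long countElements(File file) throws IOException, ClassNotFoundException {
    long total = 0;
    try (ObjectInputStream in = new ObjectInputStream(
        new BufferedInputStream(new FileInputStream(file)))) {
      int n = in.readInt();
      for (int i = 0; i < n; i++) {
        int[] chunk = (int[]) in.readObject();
        total += chunk.length;
      }
    }
    return total;
  }
}
```

Note that this only helps when several objects go through one stream; with a single writeObject call for one big instance there is nothing for reset() to release.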

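So, I was given a class that generates a language model and saves it with the .dat extension. The code was probably optimized for small model files (two model files of about 10 MB are provided as examples), but I generated a larger model, which is about 40 MB. Then, in another folder, there is another class, completely independent of the first one, that uses this model and has to load it with ObjectInputStream. This is where the problem appears: the classic "OutOfMemoryError: Java heap space".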

Writing the object:

// Create an output stream to the file. Closing the ObjectOutputStream
// (via try-with-resources) flushes its buffer and closes the file stream too.
try (ObjectOutputStream o = new ObjectOutputStream(new FileOutputStream(file))) {
  o.writeObject(this);
} catch (IOException e) {
  System.err.println("IO exception = " + e);
}

Reading the object:

ModelGeneration oRead = null;

// try-with-resources closes the whole stream chain, even if reading fails.
try (ObjectInputStream p = new ObjectInputStream(
    new BufferedInputStream(new FileInputStream(filename)))) {
  oRead = (ModelGeneration) p.readObject();
  // I originally called p.reset() here. ObjectInputStream does not support
  // mark/reset, so that call only throws an IOException.
} catch (IOException e) {
  e.printStackTrace();
} catch (ClassNotFoundException e) {
  e.printStackTrace();
}

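I tried using the reset() method, but it is useless here, because we load only one instance of one class at a time and nothing else is needed. This is also why I cannot split the file: only a single class instance is stored in the .dat file.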
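For comparison, the "rewrite without serialization" option would mean writing the fields of that single instance one by one instead of the object as a whole. The sketch below assumes, purely for illustration, that the model boils down to a Map<String, Double> of probabilities; the real ModelGeneration fields may look quite different, and the class name PlainModelIo is made up. Whether this actually lowers peak memory depends on what the model holds, so treat it as a starting point rather than a fix.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical replacement for writeObject/readObject; the Map<String, Double>
// layout is an assumption about the model, not taken from the tagger code.
class PlainModelIo {

  // Writes the entry count followed by (key, value) pairs.
  // writeUTF limits each encoded key to 65535 bytes.
  static void save(File file, Map<String, Double> probs) throws IOException {
    try (DataOutputStream out = new DataOutputStream(
        new BufferedOutputStream(new FileOutputStream(file)))) {
      out.writeInt(probs.size());
      for (Map.Entry<String, Double> e : probs.entrySet()) {
        out.writeUTF(e.getKey());
        out.writeDouble(e.getValue());
      }
    }
  }

  // Reads the entries back one by one into a presized map.
  static Map<String, Double> load(File file) throws IOException {
    try (DataInputStream in = new DataInputStream(
        new BufferedInputStream(new FileInputStream(file)))) {
      int n = in.readInt();
      Map<String, Double> probs = new HashMap<>((int) (n / 0.75f) + 1);
      for (int i = 0; i < n; i++) {
        String key = in.readUTF();
        double value = in.readDouble();
        probs.put(key, value);
      }
      return probs;
    }
  }
}
```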

Changing the heap space seems like a worse solution than optimizing the code.
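To judge how far off the current heap limit is, a small check like the following can be run with the same JVM settings as the program that loads the model. The class name HeapReport is hypothetical; the Runtime methods are standard.

```java
// Hypothetical diagnostic class; prints the heap limit and current usage.
class HeapReport {

  // Maximum heap the JVM will try to use, in MiB.
  // maxMemory() returns Long.MAX_VALUE when there is no inherent limit.
  static long maxHeapMiB() {
    return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
  }

  // Heap currently in use (allocated minus free), in MiB.
  static long usedHeapMiB() {
    Runtime rt = Runtime.getRuntime();
    return (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
  }

  public static void main(String[] args) {
    System.out.println("max heap (MiB):  " + maxHeapMiB());
    System.out.println("used heap (MiB): " + usedHeapMiB());
  }
}
```

Running it as `java -Xmx512m HeapReport` (512m is just an example value) shows what a given -Xmx setting translates to; calling usedHeapMiB() before and after the load gives a rough idea of how much the 40 MB file expands to in memory.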

I would really appreciate your advice on what I can do.

By the way, the code is here: http://svn.apache.org/repos/asf/uima/addons/trunk/Tagger/ (I only implemented the required classes for a different language).

P.S. If I create a smaller model, everything works fine, but I would prefer a bigger model.

ObjectInputStream - reading a large binary file - memory problem

0 answers: