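In my program I do a lot of serialization and deserialization of Jena (2.13.0) DatasetGraphs via Thrift and RDFDataMgr, but at some point I get an OutOfMemory exception. Can anyone help me find the problem?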
OutOfMemoryError: GC overhead limit exceeded
    at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
    at org.apache.jena.riot.thrift.TRDF.protocol(TRDF.java:72)
    at org.apache.jena.riot.thrift.StreamRDF2Thrift.<init>(StreamRDF2Thrift.java:55)
    at org.apache.jena.riot.thrift.BinRDF.streamToOutputStream(BinRDF.java:103)
    at org.apache.jena.riot.thrift.WriterDatasetThrift.write(WriterDatasetThrift.java:53)
    at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1331)
    at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1205)
    at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1195)
and
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.thrift.protocol.TCompactProtocol.readFieldBegin(TCompactProtocol.java:558)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:222)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213)
    at org.apache.thrift.TUnion.read(TUnion.java:138)
    at org.apache.jena.riot.thrift.wire.RDF_Quad$RDF_QuadStandardScheme.read(RDF_Quad.java:582)
    at org.apache.jena.riot.thrift.wire.RDF_Quad$RDF_QuadStandardScheme.read(RDF_Quad.java:549)
    at org.apache.jena.riot.thrift.wire.RDF_Quad.read(RDF_Quad.java:464)
    at org.apache.jena.riot.thrift.wire.RDF_StreamRow.standardSchemeReadValue(RDF_StreamRow.java:203)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:224)
    at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:213)
    at org.apache.thrift.TUnion.read(TUnion.java:138)
    at org.apache.jena.riot.thrift.BinRDF.apply(BinRDF.java:187)
    at org.apache.jena.riot.thrift.BinRDF.applyVisitor(BinRDF.java:176)
    at org.apache.jena.riot.thrift.BinRDF.protocolToStream(BinRDF.java:164)
    at org.apache.jena.riot.thrift.BinRDF.inputStreamToStream(BinRDF.java:149)
    at org.apache.jena.riot.RDFParserRegistry$ReaderRDFThrift.read(RDFParserRegistry.java:221)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:906)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:577)
    at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:554)
Answer 0 (score: -1)
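Actually, I am using Spark and Flink to run complex MapReduce jobs, and I serialize many groups of quads using Thrift serialization in the dataset format. The methods I use most are: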
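import java.io.ByteArrayInputStream;
import java.io.InputStream;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;
import com.hp.hpl.jena.sparql.core.DatasetGraph; // Jena 2.x package name

public static void ser(DatasetGraph dsg, byte[] b, Lang l) {
    InputStream is = new ByteArrayInputStream(b);
    RDFDataMgr.read(dsg, is, l); // parses the bytes in b into dsg using language l
    closeStream(is);             // closeStream is the poster's own helper (not shown)
    dsg.close();
}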
and
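public static DatasetGraph deser(byte[] b, Lang l) {
    // Fresh in-memory dataset that receives the parsed quads
    DatasetGraph ret = DatasetGraphFactory.createMem();
    InputStream is = new ByteArrayInputStream(b);
    RDFDataMgr.read(ret, is, l); // parses the bytes in b into ret using language l
    closeStream(is);             // closeStream is the poster's own helper (not shown)
    return ret;
}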
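Since "GC overhead limit exceeded" means the heap is nearly full rather than pointing at the exact allocation that filled it, one way to narrow this down might be to log heap usage around each call. The following is only a rough sketch under assumptions: `HeapProbe`, `usedHeapBytes`, and `measure` are hypothetical names, not part of Jena or Thrift, and it assumes Java 8 or later for the lambda support.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Jena): reports heap use around a call. */
public final class HeapProbe {
    private HeapProbe() {}

    /** Bytes of heap currently in use, as reported by the JVM. */
    public static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Runs the task, prints heap use before and after, and returns the task's result. */
    public static <T> T measure(String label, Supplier<T> task) {
        long before = usedHeapBytes();
        T result = task.get();
        long after = usedHeapBytes();
        System.err.printf("%s: heap used %d -> %d bytes (max %d)%n",
                label, before, after, Runtime.getRuntime().maxMemory());
        return result;
    }
}
```

A call such as `HeapProbe.measure("deser", () -> deser(bytes, lang))` would wrap the method above. If the "before" figure keeps climbing from one call to the next, that would suggest the DatasetGraphs (or the byte arrays) are being retained somewhere after use, rather than a single oversized payload.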