I'm trying to batch-insert into an existing database, but I get the following exception:
Exception in thread "GC-Monitor" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2245)
    at java.util.Arrays.copyOf(Arrays.java:2219)
    at java.util.ArrayList.grow(ArrayList.java:242)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:216)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:208)
    at java.util.ArrayList.add(ArrayList.java:440)
    at java.util.Formatter.parse(Formatter.java:2525)
    at java.util.Formatter.format(Formatter.java:2469)
    at java.util.Formatter.format(Formatter.java:2423)
    at java.lang.String.format(String.java:2792)
    at org.neo4j.kernel.impl.cache.MeasureDoNothing.run(MeasureDoNothing.java:64)
Fail: Transaction was marked as successful, but unable to commit transaction so rolled back.
Here is the structure of my insertion code:
public void parseExecutionRecordFile(Node episodeVersionNode, String filePath, Integer insertionBatchSize) throws Exception {
    Gson gson = new Gson();
    BufferedReader reader = new BufferedReader(new FileReader(filePath));
    String aDataRow = "";

    List<ExecutionRecord> executionRecords = new LinkedList<>();
    Integer numberOfProcessedExecutionRecords = 0;
    Integer insertionCounter = 0;
    ExecutionRecord lastProcessedExecutionRecord = null;
    Node lastProcessedExecutionRecordNode = null;

    Long start = System.nanoTime();

    while ((aDataRow = reader.readLine()) != null) {
        JsonReader jsonReader = new JsonReader(new StringReader(aDataRow));
        jsonReader.setLenient(true);
        ExecutionRecord executionRecord = gson.fromJson(jsonReader, ExecutionRecord.class);
        executionRecords.add(executionRecord);
        insertionCounter++;

        if (insertionCounter == insertionBatchSize || executionRecord.getType() == ExecutionRecord.Type.END_MESSAGE) {
            lastProcessedExecutionRecordNode = appendEpisodeData(episodeVersionNode, lastProcessedExecutionRecordNode, executionRecords, lastProcessedExecutionRecord == null ? null : lastProcessedExecutionRecord.getTraceSequenceNumber());
            executionRecords = new LinkedList<>();
            lastProcessedExecutionRecord = executionRecord;
            numberOfProcessedExecutionRecords += insertionCounter;
            insertionCounter = 0;
        }
    }
}
public Node appendEpisodeData(Node episodeVersionNode, Node previousExecutionRecordNode, List<ExecutionRecord> executionRecordList, Integer traceCounter) {
    Iterator<ExecutionRecord> executionRecordIterator = executionRecordList.iterator();

    Node previousTraceNode = null;
    Node currentTraceNode = null;
    Node currentExecutionRecordNode = null;

    try (Transaction tx = dbInstance.beginTx()) {
        // some graph insertion
        tx.success();
        return currentExecutionRecordNode;
    }
}
So basically, I read JSON objects from a file (about 20,000 objects in total) and insert them into Neo4j in batches of 10,000 records. If the file holds only 10,000 JSON objects, everything works fine. But with 20,000, it throws the exception above.
Thanks in advance, any help is greatly appreciated!
Answer 0 (score: 2)
If it works with 10,000 objects, try at least doubling the heap memory. Have a look at the following page: http://neo4j.com/docs/stable/server-performance.html
The wrapper.java.maxmemory option could solve your problem.
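For reference, on the Neo4j 2.x server distribution these settings live in conf/neo4j-wrapper.conf; the values below (in MB) are only an illustrative starting point, not a recommendation tuned to your data:

    # conf/neo4j-wrapper.conf -- 2048 is an illustrative guess, size to your workload
    wrapper.java.initmemory=2048
    wrapper.java.maxmemory=2048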
Answer 1 (score: 1)
As you're also inserting a few thousand properties, the transaction state is held in memory until commit. So I think a 10k batch size is fine for that amount of heap.
You also never close the JSON reader, so it might linger around together with its StringReader.
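A minimal sketch of that fix, keeping the Gson types from the question: com.google.gson.stream.JsonReader implements Closeable, so try-with-resources releases it (and the StringReader it wraps) on every iteration:

    while ((aDataRow = reader.readLine()) != null) {
        // closed automatically at the end of each iteration
        try (JsonReader jsonReader = new JsonReader(new StringReader(aDataRow))) {
            jsonReader.setLenient(true);
            ExecutionRecord executionRecord = gson.fromJson(jsonReader, ExecutionRecord.class);
            executionRecords.add(executionRecord);
            insertionCounter++;
            // ... batching logic unchanged ...
        }
    }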
You should also use an ArrayList initialized to your batch size, and call list.clear() instead of recreating/reassigning the list.
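Applied to the loop from the question, that suggestion would look roughly like this (safe here because appendEpisodeData consumes the list inside its own transaction before returning):

    // allocate once, sized to the batch, instead of new LinkedList<>() per batch
    List<ExecutionRecord> executionRecords = new ArrayList<>(insertionBatchSize);
    // ...
    if (insertionCounter == insertionBatchSize || executionRecord.getType() == ExecutionRecord.Type.END_MESSAGE) {
        lastProcessedExecutionRecordNode = appendEpisodeData(episodeVersionNode, lastProcessedExecutionRecordNode,
                executionRecords, lastProcessedExecutionRecord == null ? null : lastProcessedExecutionRecord.getTraceSequenceNumber());
        executionRecords.clear(); // reuse the same backing array instead of reallocating
        lastProcessedExecutionRecord = executionRecord;
        numberOfProcessedExecutionRecords += insertionCounter;
        insertionCounter = 0;
    }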