Neo4j Embedded 2.2.1: Exception in thread "GC-Monitor" java.lang.OutOfMemoryError: Java heap space

Date: 2015-08-06 16:48:04

Tags: neo4j out-of-memory neo4j-embedded

I am trying to batch-insert records into an existing database, but I get the following exception:

Exception in thread "GC-Monitor" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2245)
    at java.util.Arrays.copyOf(Arrays.java:2219)
    at java.util.ArrayList.grow(ArrayList.java:242)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:216)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:208)
    at java.util.ArrayList.add(ArrayList.java:440)
    at java.util.Formatter.parse(Formatter.java:2525)
    at java.util.Formatter.format(Formatter.java:2469)
    at java.util.Formatter.format(Formatter.java:2423)
    at java.lang.String.format(String.java:2792)
    at org.neo4j.kernel.impl.cache.MeasureDoNothing.run(MeasureDoNothing.java:64)
Fail: Transaction was marked as successful, but unable to commit transaction so rolled back.

Here is the structure of my insertion code:

public void parseExecutionRecordFile(Node episodeVersionNode, String filePath, Integer insertionBatchSize) throws Exception {
        Gson gson = new Gson();
        BufferedReader reader = new BufferedReader(new FileReader(filePath));
        String aDataRow = "";
        List<ExecutionRecord> executionRecords = new LinkedList<>();

        Integer numberOfProcessedExecutionRecords = 0;
        Integer insertionCounter = 0;
        ExecutionRecord lastProcessedExecutionRecord = null;
        Node lastProcessedExecutionRecordNode = null;

        Long start = System.nanoTime();
        while((aDataRow = reader.readLine()) != null) {
            // Parse one JSON object per line of the input file.
            JsonReader jsonReader = new JsonReader(new StringReader(aDataRow));
            jsonReader.setLenient(true);
            ExecutionRecord executionRecord = gson.fromJson(jsonReader, ExecutionRecord.class);
            executionRecords.add(executionRecord);

            insertionCounter++;

            // Flush a full batch, or the final partial batch when the end marker arrives.
            if(insertionCounter == insertionBatchSize || executionRecord.getType() == ExecutionRecord.Type.END_MESSAGE) {
                lastProcessedExecutionRecordNode = appendEpisodeData(episodeVersionNode, lastProcessedExecutionRecordNode, executionRecords, lastProcessedExecutionRecord == null ? null : lastProcessedExecutionRecord.getTraceSequenceNumber());
                executionRecords = new LinkedList<>();
                lastProcessedExecutionRecord = executionRecord;
                numberOfProcessedExecutionRecords += insertionCounter;
                insertionCounter = 0;
            }
        }
    }

public Node appendEpisodeData(Node episodeVersionNode, Node previousExecutionRecordNode, List<ExecutionRecord> executionRecordList, Integer traceCounter) {
        Iterator<ExecutionRecord> executionRecordIterator = executionRecordList.iterator();

        Node previousTraceNode = null;
        Node currentTraceNode = null;
        Node currentExecutionRecordNode = null;

        try (Transaction tx = dbInstance.beginTx()) {
            // some graph insertion

            tx.success();
            return currentExecutionRecordNode;
        }
    }

So basically, I read JSON objects from a file (about 20,000 of them) and insert them into Neo4j every 10,000 records. With only 10,000 JSON objects in the file it works fine, but with 20,000 it throws the exception above.

Thanks in advance, any help is much appreciated!

2 Answers:

Answer 0 (score: 2):

Since it works with 10,000 objects, just try at least doubling the heap memory. Have a look at the following page: http://neo4j.com/docs/stable/server-performance.html

The wrapper.java.maxmemory option could solve your problem.
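For reference, a minimal sketch of both ways to raise the heap; the sizes below are illustrative, not recommendations. neo4j-wrapper.conf applies to the Neo4j 2.x server, while an embedded database (as in the question) uses the heap of your own JVM process, so there you would pass -Xmx instead (myapp.jar and com.example.Main are placeholder names):

# conf/neo4j-wrapper.conf (Neo4j 2.x server; values in MB, example sizes)
wrapper.java.initmemory=2048
wrapper.java.maxmemory=4096

# Embedded Neo4j: size the hosting JVM's heap instead
java -Xmx4g -cp myapp.jar com.example.Main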

Answer 1 (score: 1):

As you are also inserting a few thousand properties, the tx state is kept in memory until commit. So I think a 10k batch size is fine for that amount of heap.

You also never close the JSON reader, so it may linger around together with its underlying StringReader.
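A minimal sketch of how the question's parse loop could close both readers with try-with-resources; Gson's JsonReader implements Closeable, and gson, executionRecords, ExecutionRecord, and the batching logic are assumed from the question:

import com.google.gson.stream.JsonReader;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.StringReader;

try (BufferedReader reader = new BufferedReader(new FileReader(filePath))) {
    String aDataRow;
    while ((aDataRow = reader.readLine()) != null) {
        // Closing the JsonReader also closes the underlying StringReader.
        try (JsonReader jsonReader = new JsonReader(new StringReader(aDataRow))) {
            jsonReader.setLenient(true);
            ExecutionRecord executionRecord = gson.fromJson(jsonReader, ExecutionRecord.class);
            executionRecords.add(executionRecord);
        }
        // ... batching logic unchanged from the question ...
    }
}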

You should also use an ArrayList initialized to the batch size, and call list.clear() instead of re-creating and re-assigning the list.
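A minimal sketch of that change against the question's parseExecutionRecordFile, reusing one pre-sized list across batches:

// Allocate once with the expected capacity instead of a fresh LinkedList per batch.
List<ExecutionRecord> executionRecords = new ArrayList<>(insertionBatchSize);
...
if (insertionCounter == insertionBatchSize || executionRecord.getType() == ExecutionRecord.Type.END_MESSAGE) {
    lastProcessedExecutionRecordNode = appendEpisodeData(episodeVersionNode, lastProcessedExecutionRecordNode, executionRecords,
            lastProcessedExecutionRecord == null ? null : lastProcessedExecutionRecord.getTraceSequenceNumber());
    lastProcessedExecutionRecord = executionRecord;
    numberOfProcessedExecutionRecords += insertionCounter;
    insertionCounter = 0;
    executionRecords.clear(); // reuse the same list instead of reallocating it
}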