Neo4j throws java.lang.OutOfMemoryError when inserting a large data set

Date: 2014-02-13 03:41:56

Tags: transactions neo4j out-of-memory

Hi all,

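My program throws java.lang.OutOfMemoryError when inserting a large data set, even though I have tried some tuning tricks, such as changing java_opts and committing transactions in batches. I had heard that the JVM's memory usage would drop each time Neo4j commits a transaction, but that does not seem to work. The exception is thrown once about 7,000,000 lines have been processed. Any suggestions?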

Here are my Neo4j properties:

neostore.propertystore.db.index.keys.mapped_memory=20M
neostore.propertystore.db.index.mapped_memory=20M
neostore.nodestore.db.mapped_memory=400M
neostore.relationshipstore.db.mapped_memory=1000M
neostore.propertystore.db.mapped_memory=400M
neostore.propertystore.db.strings.mapped_memory=400M

Here are my JVM options:

 java -jar -server -Xmx2G -XX:+UseConcMarkSweepGC neodataio.jar $@

Here is my code:

public Node createNode(String type, String v) {
    stype = type;
    UniqueFactory.UniqueNodeFactory factory = new UniqueFactory.UniqueNodeFactory(
            db, type) {
        @Override
        protected void initialize(Node created,
                Map<String, Object> properties) {
            created.addLabel(DynamicLabel.label(stype));
            created.setProperty("v", properties.get(stype));
        }
    };
    return factory.getOrCreate(type, v);
}

private void processLine(String line) {
    line = stripeStr(line);
    String[] fields = line.split("["+splitor+"]");
    List<Node> row = new ArrayList<Node>();
    Map<String,Boolean> unqi = new HashMap<String,Boolean>();
    for (String field : fields) {
        String[] kvs = field.split("["+kv+"]");
        if(kvs.length==2
                &&!unqi.containsKey(kvs[1])
                &&!stripeStr(kvs[1]).equals("")
                &&!stripeStr(kvs[1]).toLowerCase().equals("null")){
            Node n = createNode(stripeStr(kvs[0]), stripeStr(kvs[1]));
            row.add(n);
            unqi.put(kvs[1], true);
        }
    }
    if (row.size() > 1) {
        for (int i = 1; i < row.size(); i++) {
            row.get(0).createRelationshipTo(row.get(i), Importer.connect);
        }
    }
}

private void processBatch(ArrayList<String> batch){
    Transaction tx = db.beginTx();
    try {
        for(String line : batch) {
            processLine(line);
        }
        tx.success();
    } finally {
        tx.close();
    }
}

private String stripeStr(String str){
    return str.trim().replace("\n", "").replace("\t", "");
}

public void processFile(String filepth) throws IOException {
    long begin = new Date().getTime();
    File f = new File(filepth);
    FileInputStream fi = new FileInputStream(f);
    BufferedReader dr=new BufferedReader(new InputStreamReader(fi));
    String line;
    int i = 1;
    ArrayList<String> batch = new ArrayList<String>();
    while((line=dr.readLine())!=null){
        batch.add(line);
        if(i%batchsize == 0){
            processBatch(batch);
            batch = new ArrayList<String>();
            System.out.println(i);
        }
        i++;
    }
    processBatch(batch);
    System.out.println(i);
    long end = new Date().getTime();
    System.out.println("cost time:"+(end-begin));
}

The exception:

 Exception in thread "GC-Monitor" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.StringBuilder.toString(StringBuilder.java:405)
    at org.neo4j.kernel.impl.cache.MeasureDoNothing.run(MeasureDoNothing.java:84)
Exception in thread "main" org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:140)
    at com.bfd.finance.neo4j.dataio.Importer.processBatch(Importer.java:79)
    at com.bfd.finance.neo4j.dataio.Importer.processFile(Importer.java:98)
    at com.bfd.finance.neo4j.dataio.Importer.main(Importer.java:161)
Caused by: org.neo4j.graphdb.TransactionFailureException: commit threw exception
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:498)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:397)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:122)
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124)
    ... 3 more
Caused by: javax.transaction.xa.XAException
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:553)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:460)
    ... 6 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.createEntry(HashMap.java:901)
    at java.util.HashMap.putForCreate(HashMap.java:554)
    at java.util.HashMap.putAllForCreate(HashMap.java:559)
    at java.util.HashMap.<init>(HashMap.java:298)
    at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.applyCommit(WriteTransaction.java:817)
    at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.doCommit(WriteTransaction.java:751)
    at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:322)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commitWriteTx(XaResourceManager.java:530)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:446)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:545)
    ... 7 more

1 Answer:

Answer 0 (score: 1)

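What we do is commit the transaction every 5,000 nodes, and that works flawlessly. The obvious drawback is that if something goes wrong at node 5,001, you can no longer roll back the first 5,000 nodes.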
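
The fixed-size batching described above could look roughly like the sketch below. It is only an illustration of the chunking logic and uses no Neo4j classes: the class name BatchedImport, the method importInBatches and the commitBatch callback are hypothetical names, and the callback stands in for something like the question's processBatch, which opens one transaction per chunk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class BatchedImport {

    /**
     * Hands the input to commitBatch in chunks of at most batchSize lines
     * and returns the number of chunks handed over.
     * commitBatch is a placeholder for "open a transaction, process these
     * lines, commit"; each call is assumed to be one independent transaction.
     */
    public static int importInBatches(Iterable<String> lines, int batchSize,
                                      Consumer<List<String>> commitBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        int committed = 0;
        for (String line : lines) {
            batch.add(line);
            if (batch.size() == batchSize) {
                commitBatch.accept(batch);
                committed++;
                // Start a fresh list so the finished chunk can be garbage collected.
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the final partial chunk, but never commit an empty one.
        if (!batch.isEmpty()) {
            commitBatch.accept(batch);
            committed++;
        }
        return committed;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        int chunks = importInBatches(lines, 2,
                batch -> System.out.println("commit " + batch));
        System.out.println(chunks + " chunks committed");
    }
}
```

Because every chunk is committed on its own, a failure in a later chunk leaves the earlier ones in the database, which is exactly the rollback drawback mentioned above. Logging how many chunks have been committed makes it possible to resume from that point.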

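As for the BatchInserter: you can use it if your program imports a one-off data set and the database does not need to be available for other requests during the import. For all other large-import use cases, the BatchInserter will not help you.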