I am trying to load a fairly large file (about 200 million rows) into neo4j using LOAD CSV, like this:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///home/manu/citation.csv.gz' AS line
MATCH (origin:`publication` {`id`: line.`cite_from`})
MATCH (destination:`publication` {`id`: line.`cite_to`})
MERGE (origin)-[rel:CITES]->(destination);
But I still get out-of-memory errors, such as
raise CypherError.hydrate(**metadata)
neo4j.exceptions.TransientError: There is not enough memory to perform
the current task. Please try increasing 'dbms.memory.heap.max_size' in
the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you
are using Neo4j Desktop, found through the user interface) or if you
are running an embedded installation increase the heap by using '-Xmx'
command line flag, and then restart the database.
in the client when running the code, and on the server:
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "neo4j.StorageMaintenance-14"
2018-12-05 15:44:32.967+0000 WARN Java heap space
java.lang.OutOfMemoryError: Java heap space
2018-12-05 15:44:32.968+0000 WARN Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@b6328a3 in QueuedThreadPool[qtp483052300]@1ccacb0c{STARTED,8<=8<=14,i=1,q=0}[ReservedThreadExecutor@f5cbd17{s=0/1,p=0}]
Exception in thread "neo4j.ServerTransactionTimeout-6" Exception in thread "neo4j.TransactionTimeoutMonitor-11" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap
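Of course, I tried raising dbms.memory.heap.max_size (up to 24 GB; with anything above that, neo4j does not even start on my 32 GB machine), but the error persists. What I don't quite understand is this: if neo4j tries to load everything at once anyway, what is the purpose of the USING PERIODIC COMMIT part? Reading the manual, or for example this thread, you would think that USING PERIODIC COMMIT solves exactly the problem I am running into.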
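Any clues? The only workaround I can think of is splitting the file into several parts, but that does not look like an elegant solution (and if it works, couldn't neo4j do that for me transparently?)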
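For reference, the file-splitting workaround mentioned above could be sketched roughly as follows. This is only an illustration, not something from the original post: the function name split_csv_gz, the part_00000.csv.gz naming scheme, and the default chunk size are all made up here, and whether loading smaller parts actually avoids the heap error is untested.

```python
import csv
import gzip
from pathlib import Path


def split_csv_gz(source, out_dir, rows_per_part=1_000_000):
    """Split a gzipped CSV into smaller gzipped parts.

    The header row is repeated in every part so that each one can be
    loaded on its own with LOAD CSV WITH HEADERS.
    Returns the list of part paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out = None
    writer = None
    try:
        with gzip.open(source, "rt", newline="") as src:
            reader = csv.reader(src)
            header = next(reader, None)
            if header is None:
                return parts  # empty input, nothing to split
            for i, row in enumerate(reader):
                if i % rows_per_part == 0:
                    # Close the previous part (if any) and start a new one.
                    if out is not None:
                        out.close()
                    # Hypothetical naming scheme: part_00000.csv.gz, ...
                    path = out_dir / f"part_{len(parts):05d}.csv.gz"
                    out = gzip.open(path, "wt", newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    parts.append(path)
                writer.writerow(row)
    finally:
        if out is not None:
            out.close()
    return parts
```

Each resulting part could then be fed to the same LOAD CSV statement, one file at a time, by changing only the file URL.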
Edit: using EXPLAIN
Cheers.
Answer 0 (score: 0)
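This is probably more of a workaround than a "solution", but putting a UNIQUE constraint on the property that the cypher query looks up so heavily did the trick for me: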
CREATE CONSTRAINT ON (p:publication) ASSERT p.id IS UNIQUE
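A likely reason this helps is that a uniqueness constraint is backed by an index on publication.id, so each MATCH in the import can use an index lookup instead of scanning every publication node; whether that alone explains the heap errors is not confirmed here. As a rough sketch, assuming the neo4j Python driver and the Neo4j 3.x syntax used in the question, the constraint would be created before the import runs. The helper names build_import_statements and run_import, and the batch size of 10000, are hypothetical.

```python
# Constraint statement taken verbatim from the answer (Neo4j 3.x syntax).
CONSTRAINT = "CREATE CONSTRAINT ON (p:publication) ASSERT p.id IS UNIQUE"

# Import statement from the question; the explicit batch size is an assumption.
LOAD_TEMPLATE = """
USING PERIODIC COMMIT {batch_size}
LOAD CSV WITH HEADERS FROM '{url}' AS line
MATCH (origin:`publication` {{`id`: line.`cite_from`}})
MATCH (destination:`publication` {{`id`: line.`cite_to`}})
MERGE (origin)-[rel:CITES]->(destination)
"""


def build_import_statements(url, batch_size=10000):
    """Return the statements in the order they should run: constraint first."""
    load = LOAD_TEMPLATE.format(url=url, batch_size=batch_size).strip()
    return [CONSTRAINT, load]


def run_import(session, url, batch_size=10000):
    """Run the statements on a driver session (or any object with run())."""
    for statement in build_import_statements(url, batch_size):
        # session.run() uses an auto-commit transaction, which
        # USING PERIODIC COMMIT requires; it fails in an explicit one.
        session.run(statement).consume()
```

With a real driver this would be called as run_import(session, 'file:///home/manu/citation.csv.gz') inside a `with driver.session() as session:` block.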