Question

I have a neo4j database.

example_nodes: sourceId, targetId
(each node contains a sourceId and a targetId)

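I am trying to build relationships between all of the nodes, but I keep running into OOM problems. On a system with 16G of RAM, I increased the JVM heap size to -Xmx4096m and set dbms.memory.pagecache.size=16g.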

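I assume I need to optimize my query, because it does not complete in any of its current forms. However, I have tried the following three to no avail: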

MATCH (start:example_nodes),(end:example_nodes) WHERE start.targetId = end.sourceId CREATE (start)-[r:CONNECT]->(end) RETURN r

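(On a subset of 5000 nodes, the query above completes in just a few seconds. It does of course warn: "This query builds a cartesian product between disconnected patterns.")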

MATCH (start:example_nodes) WITH start MATCH (end:example_nodes) WHERE start.targetId = end.sourceId CREATE (start)-[r:CONNECT]->(end) RETURN r

OPTIONAL MATCH (start:example_nodes) WITH start MATCH (end:example_nodes) WHERE start.targetId = end.sourceId CREATE (start)-[r:CONNECT]->(end) RETURN r

Any ideas on how to optimize this query so that it succeeds would be greatly appreciated.

-

Edit

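In many ways I feel that, although the apoc library does solve the memory problem, the function could be optimized if it simply ran this very simple pseudocode: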

for each start_gene
    create relationship to end_gene where start_gene.targetId = end_gene.sourceId
    move on to the next once the relationship has been created

But I'm not sure how to implement this in Cypher.
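
The per-start-node loop described in the pseudocode can be sketched outside the database to show why a keyed lookup avoids the cartesian product. This is a minimal illustration in plain Python over an in-memory list, not Cypher and not any Neo4j API; the function name `connect_by_lookup` and the sample data are hypothetical.

```python
from collections import defaultdict

def connect_by_lookup(nodes):
    """Return (start_index, end_index) pairs where start.targetId == end.sourceId."""
    # Index every node position by its sourceId in a single pass.
    by_source = defaultdict(list)
    for i, node in enumerate(nodes):
        by_source[node["sourceId"]].append(i)

    # For each start node, look up its matching end nodes directly
    # instead of comparing it against every other node.
    relationships = []
    for i, start in enumerate(nodes):
        for j in by_source.get(start["targetId"], []):
            relationships.append((i, j))
    return relationships

# Hypothetical sample data: three nodes forming a cycle a -> b -> c -> a.
nodes = [
    {"sourceId": "a", "targetId": "b"},
    {"sourceId": "b", "targetId": "c"},
    {"sourceId": "c", "targetId": "a"},
]
print(connect_by_lookup(nodes))  # [(0, 1), (1, 2), (2, 0)]
```

In Neo4j the analogous structure would presumably be an index on the sourceId property of example_nodes, though whether the planner actually uses it depends on the version and the shape of the query.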

Answer 1

You can use the APOC library for batching.

call apoc.periodic.commit("
MATCH (start:example_nodes),(end:example_nodes) WHERE not (start)-[:CONNECT]->(end) and id(start) > id(end) AND start.targetId = end.sourceId
with start,end limit {limit}
CREATE (start)-[:CONNECT]->(end) 
RETURN count(*)
",{limit:5000})
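
As far as I understand it, apoc.periodic.commit re-runs the statement in separate transactions until it returns 0, which is why the query filters out pairs that are already connected and caps each run with a limit. The control flow can be sketched in plain Python as below; this is only a model of the batching idea, the names `create_batch` and `periodic_commit` are hypothetical, and the `id(start) > id(end)` filter from the query is not modeled.

```python
def create_batch(pending, created, limit):
    """One 'transaction': create up to `limit` missing relationships, return the count."""
    # Skip pairs that already exist, mirroring the "not (start)-[:CONNECT]->(end)" check.
    batch = [pair for pair in pending if pair not in created][:limit]
    created.update(batch)
    return len(batch)

def periodic_commit(pending, limit):
    """Re-run create_batch until it reports zero new relationships."""
    created = set()
    rounds = 0
    while create_batch(pending, created, limit) > 0:
        rounds += 1
    return created, rounds

# Hypothetical workload: 12 candidate pairs, committed 5 at a time.
pairs = [(i, i + 1) for i in range(12)]
created, rounds = periodic_commit(pairs, limit=5)
print(len(created), rounds)  # 12 3
```

Each round holds at most `limit` new relationships in memory, which is the property that keeps the heap bounded; the trade-off in this sketch is that every round rescans the candidates to find the ones still missing.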

neo4j - Creating relationships between all nodes in a database (out of memory)
