Question

This is an extension of another SO question (Neo4j 2.0 Merge with unique constraints performance bug?), but I am trying a different approach.

MATCH (c:Contact),(a:Address), (ca:ContactAddress)
WITH c,a,collect(ca) as matrix
FOREACH (car in matrix | 
MERGE 
(c {ContactId:car.ContactId})
-[r:CONTACT_ADDRESS {ContactId:car.ContactId,AddressId:car.AddressId}]->
(a {AddressId:car.AddressId}))

This locks up my Neo4j server, and I am trying to understand why. My thinking behind the query is as follows:

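I want to select all Contact and Address nodes (as well as the ContactAddress nodes).
I want to iterate over all ContactAddress nodes (which hold the relationship data between Contact and Address) and relate the Contact and Address nodes to each other.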

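When I run the code above, the server sits at around 40% CPU and memory keeps climbing. I stopped it after the browser connection (myserver:7474/browser) dropped, reset my database, and tried again, this time with the following: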

match (c:Contact),(a:Address), (ca:ContactAddress)
WITH c,a,collect(distinct ca) as matrix
foreach (car in matrix | 
CREATE 
(c {ContactId:car.ContactId})
-[r:CONTACT_ADDRESS {ContactId:car.ContactId,AddressId:car.AddressId}]->
(a {AddressId:car.AddressId}))

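Same result: a lockup and a disconnect from the Neo4j database, while CPU stays pegged and RAM usage keeps climbing. Is there a loop here that I am not seeing?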

I also tried this (same hang):

FOREACH(row in {PassedInList} | 
    MERGE (c:Contact {ContactId:row.ContactId})
    MERGE (a:Address {AddressId:row.AddressId})
    MERGE (c)-[r:CONTACT_ADDRESS]->(a)
    )

Resolved:

MATCH (ca:ContactAddress)
MATCH (c:Contact {ContactId:ca.ContactId}), (a:Address {AddressId:ca.AddressId})
MERGE p = (c)
          -[r:CONTACT_ADDRESS {ContactId:ca.ContactId,AddressId:ca.AddressId}]->
          (a)

Answer 1

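When you write match (c:Contact),(a:Address), (ca:ContactAddress) with three disconnected nodes, Neo4j matches every possible combination of the three, i.e. their full Cartesian product. If you have 100 nodes of each type, that is 100 x 100 x 100 = 1,000,000 results.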
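To make the row counts concrete, here is a minimal Python sketch that models the two query shapes with plain lists and dictionaries. It is not how Neo4j executes a query; the lists, the dictionaries, and the assumption that every ContactAddress row points at an existing Contact and Address are all hypothetical stand-ins for illustration.

```python
from itertools import product

# Hypothetical in-memory stand-ins for the three node labels (100 of each).
contacts = [{"ContactId": i} for i in range(100)]
addresses = [{"AddressId": i} for i in range(100)]
contact_addresses = [{"ContactId": i, "AddressId": i} for i in range(100)]

# Disconnected pattern: every (c, a, ca) combination becomes a row.
cartesian_rows = sum(1 for _ in product(contacts, addresses, contact_addresses))

# Keyed pattern: start from each ca and look up its c and a by ID.
contact_by_id = {c["ContactId"]: c for c in contacts}
address_by_id = {a["AddressId"]: a for a in addresses}
keyed_rows = [
    (contact_by_id[ca["ContactId"]], address_by_id[ca["AddressId"]], ca)
    for ca in contact_addresses
    if ca["ContactId"] in contact_by_id and ca["AddressId"] in address_by_id
]

print(cartesian_rows)   # 1000000
print(len(keyed_rows))  # 100
```

The disconnected form grows with the product of the three label counts, while the keyed form grows only with the number of ContactAddress rows.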

Try this:

MATCH (ca:ContactAddress), (c:Contact {ContactId:ca.ContactId}), (a:Address {AddressId:ca.AddressId})
MERGE (c)-[r:CONTACT_ADDRESS {ContactId:ca.ContactId,AddressId:ca.AddressId}]->(a)

This matches each :ContactAddress node and only the :Contact and :Address nodes whose IDs correspond to it. It then creates the relationship if it does not already exist.
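The "create it only if it does not already exist" behaviour can be pictured with a small Python model. This is only a sketch of the match-or-create idea, not Neo4j's implementation; the function merge_relationship and the set of (ContactId, AddressId) pairs are hypothetical names chosen for illustration.

```python
def merge_relationship(relationships, contact_id, address_id):
    """Add the (contact, address) link only if it is not already present.

    Returns True when a new link was created, False when it already existed.
    """
    key = (contact_id, address_id)
    created = key not in relationships
    relationships.add(key)
    return created


relationships = set()
first = merge_relationship(relationships, 1, 10)   # link is new, so it is created
second = merge_relationship(relationships, 1, 10)  # link already exists, nothing added

print(first, second, len(relationships))
```

Running the same step twice leaves a single link, which is why the query can be re-run without producing duplicate CONTACT_ADDRESS relationships.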

If you want it to be clearer, you can also split the MATCH:

MATCH (ca:ContactAddress)
MATCH (c:Contact {ContactId:ca.ContactId}), (a:Address {AddressId:ca.AddressId})
MERGE (c)-[r:CONTACT_ADDRESS {ContactId:ca.ContactId,AddressId:ca.AddressId}]->(a)

Loop through nodes in Neo4j and create relationships
