Finding all leaves of a selected subgraph using Neo4j / Cypher

Time: 2018-08-28 23:09:01

Tags: neo4j cypher

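Initial situation

  • A large Neo4j 3.4.6 graph with a tree-like structure (10 levels deep, 10 million nodes).
  • Without exception, all nodes are connected to each other. All nodes share one type, and all relationships share one type.
  • There is exactly one central root node.
  • Simplified example: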

Graphic representation

CREATE (Root:CustomType {name: 'Root'})
CREATE (NodeA:CustomType {name: 'NodeA'})
CREATE (NodeB:CustomType {name: 'NodeB'})
CREATE (NodeC:CustomType {name: 'NodeC'})
CREATE (NodeD:CustomType {name: 'NodeD'})
CREATE (NodeE:CustomType {name: 'NodeE'})
CREATE (NodeF:CustomType {name: 'NodeF'})
CREATE (NodeG:CustomType {name: 'NodeG'})
CREATE (NodeH:CustomType {name: 'NodeH'})
CREATE (NodeI:CustomType {name: 'NodeI'})
CREATE (NodeJ:CustomType {name: 'NodeJ'})
CREATE (NodeK:CustomType {name: 'NodeK'})
CREATE (NodeL:CustomType {name: 'NodeL'})
CREATE (NodeM:CustomType {name: 'NodeM'})
CREATE (NodeN:CustomType {name: 'NodeN'})
CREATE (NodeO:CustomType {name: 'NodeO'})
CREATE (NodeP:CustomType {name: 'NodeP'})
CREATE (NodeQ:CustomType {name: 'NodeQ'})

CREATE
  (Root)-[:CONTAINS]->(NodeA),
  (Root)-[:CONTAINS]->(NodeB),
  (Root)-[:CONTAINS]->(NodeC),
  (NodeA)-[:CONTAINS]->(NodeD),
  (NodeA)-[:CONTAINS]->(NodeE),
  (NodeA)-[:CONTAINS]->(NodeF),
  (NodeE)-[:CONTAINS]->(NodeG),
  (NodeE)-[:CONTAINS]->(NodeH),
  (NodeF)-[:CONTAINS]->(NodeI),
  (NodeF)-[:CONTAINS]->(NodeJ),
  (NodeF)-[:CONTAINS]->(NodeK),
  (NodeI)-[:CONTAINS]->(NodeL),
  (NodeI)-[:CONTAINS]->(NodeM),
  (NodeJ)-[:CONTAINS]->(NodeN),
  (NodeK)-[:CONTAINS]->(NodeO),
  (NodeK)-[:CONTAINS]->(NodeP),
  (NodeM)-[:CONTAINS]->(NodeQ);

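The challenge to solve

  • With a MATCH-WITH-UNWIND Cypher query I can successfully select a subtree and bind it to a path. Let's say the subtree spans the nodes A, E, F, I and J.
  • Based on this path, I need all leaves of the subtree, not the leaves of the complete tree as I get now.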

MATCH
  path = (:CustomType {name:'NodeA'})-[:CONTAINS*]->(:CustomType {name:'NodeJ'}) /* simplified */
WITH
  nodes(path) as selectedPath
  /* here: necessary magic to identify the leaf nodes of the subtree */
RETURN
  leafNode;
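  • Among other things, I tried the WHERE NOT(node-->()) approach, but realized that it only finds the leaves of the complete tree. Unfortunately, I could not get the WHERE NOT(node-->()) clause to respect the boundaries of the selected subtree.
  • So, how can I find all leaves of the selected subgraph using Cypher and Neo4j? Can you suggest how to solve this? Thanks in advance for pointing me in the right direction!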

1 answer:

Answer 0: (score: 1)

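You correctly noticed that checking for nodes without children only works for the entire tree. Instead, you need to go through all the relationships in the subtree and find the nodes of the subtree that appear as the end of a relationship but never as the start of one: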

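MATCH
  path = (:CustomType {name:'NodeA'})-[:CONTAINS*]->(:CustomType {name:'NodeJ'})
// Flatten every selected path into its individual relationships
UNWIND relationships(path) AS r
WITH collect(DISTINCT endNode(r))   AS endNodes,
     collect(DISTINCT startNode(r)) AS startNodes
UNWIND endNodes AS leaf
// Carry startNodes through the WITH so the WHERE can reference it
WITH leaf, startNodes WHERE NOT leaf IN startNodes
RETURN leaf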
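
To illustrate the idea on the sample data from the question, here is a minimal sketch that selects the subtree spanning A, E, F, I and J. The way the subtree is selected here (an IN filter on the names of the path end points, with the hypothetical variable name target) is only an assumption standing in for the asker's real MATCH-WITH-UNWIND query, which is not shown in full.

```cypher
// Assumption: the subtree is selected as every path from NodeA
// to one of NodeE, NodeI or NodeJ (stand-in for the real selection query).
MATCH
  path = (:CustomType {name:'NodeA'})-[:CONTAINS*]->(target:CustomType)
WHERE target.name IN ['NodeE', 'NodeI', 'NodeJ']
// One row per relationship on any selected path
UNWIND relationships(path) AS r
// Aggregate all rows into two de-duplicated lists
WITH collect(DISTINCT endNode(r))   AS endNodes,
     collect(DISTINCT startNode(r)) AS startNodes
UNWIND endNodes AS leaf
// A subtree leaf ends some relationship but starts none within the subtree
WITH leaf, startNodes
WHERE NOT leaf IN startNodes
RETURN leaf.name AS leafName
ORDER BY leafName
```

Under that assumption, the selected paths are A->E, A->F->I and A->F->J, so the end nodes are E, F, I and J and the start nodes are A and F. The query should therefore return NodeE, NodeI and NodeJ, even though all three have children in the complete tree. Note that a subtree consisting of a single node contains no relationships and so yields no rows with this approach.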