Question

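I have a graph with millions of nodes and relationships. I need to get all paths between two nodes. In my case, every node on a path must have the same label.

My query looks like this: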

match (n:Label) match (m:Label) where n.NAME='foo' and m.NAME='foo2'
match p=(n)-[r*..20]-(m) where all(x in nodes(p) where (x:Label))
with p
return p, EXTRACT(x IN nodes(p) | x.NAME), EXTRACT(r IN relationships(p) | type(r))

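There are only about 20 nodes with the label "Label", but this query traverses the whole graph to find every possible path between the two nodes and only then tries to reduce the paths with my "where all" clause. It then crashes my database.

I need to take the subgraph made up of the nodes labelled "Label" and their relationships, and then query for paths within that subgraph, to reduce the cost.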

Answer 1

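There are a number of Path Expander APOC procedures that should be helpful, since many of them let you specify a label filter while the paths are being generated.

For example: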

MATCH (n:Label {NAME: 'foo'})
WITH COLLECT(n) AS ns1
MATCH (m:Label {NAME: 'foo2'})
WITH ns1 + COLLECT(m) AS startNodes
CALL apoc.path.expandConfig(
  startNodes,
  {labelFilter: '+Label', minLevel: 1, maxLevel: 20}
) YIELD path
RETURN
  path,
  [x IN nodes(path) | x.NAME] AS names,
  [r IN relationships(path) | TYPE(r)] AS types;
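
The query above expands from both endpoints and returns every all-:Label path of up to 20 hops, wherever it ends. If only the paths that actually stop at the second node are wanted, one possible variation (a sketch, assuming an APOC version whose expandConfig supports the terminatorNodes option, and assuming a single node matches each NAME) is to start from one endpoint and name the other as a terminator:

```cypher
// Sketch only: requires the APOC plugin; 'foo' and 'foo2' are the names from the question.
MATCH (n:Label {NAME: 'foo'}), (m:Label {NAME: 'foo2'})
CALL apoc.path.expandConfig(n, {
  labelFilter: '+Label',   // only traverse nodes that carry :Label
  terminatorNodes: [m],    // only return paths ending at m, and do not expand past it
  minLevel: 1,
  maxLevel: 20
}) YIELD path
RETURN
  path,
  [x IN nodes(path) | x.NAME] AS names,
  [r IN relationships(path) | type(r)] AS types;
```

Because the label filter is applied during expansion, the traversal should stay inside the small :Label subgraph instead of visiting the rest of the graph first.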

Answer 2

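Trying to find all possible paths in a graph with millions of nodes (even with the length capped at 20) is likely to cause a memory overflow.

What you can do is break the search down into smaller segments. The query will not be as elegant, but it should work.

For example, if we expand 5 path lengths at a time, a two-segment query would look like this: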

 MATCH p1 = (n1:Label)-[r1*..5]-(n2:Label), p2 = (n2:Label)-[r2*..5]-(n3:Label) 
 WHERE all(x1 in nodes(p1) WHERE (x1:Label)) 
 AND all(x2 in nodes(p2) WHERE (x2:Label)) 
 RETURN r1, r2

The cost plan for this query looks like this:

 +-----------------------+----------------------+---------------------+
 | Operator              | Variables            | Other               |
 +-----------------------+----------------------+---------------------+
 | +ProduceResults       | r1                   | r1                  |
 | |                     +----------------------+---------------------+
 | +Filter               | n1, n2, n3, r1, r2   | [see below]         |
 | |                     +----------------------+---------------------+
 | +VarLengthExpand(All) | n1, r1 -- n2, n3, r2 | (n2)-[r1:*..5]-(n1) |
 | |                     +----------------------+---------------------+
 | +Filter               | n2, n3, r2           | n3:Label            |
 | |                     +----------------------+---------------------+
 | +VarLengthExpand(All) | n3, r2 -- n2         | (n2)-[r2:*..5]-(n3) |
 | |                     +----------------------+---------------------+
 | +NodeByLabelScan      | n2                   | :Label              |
 +-----------------------+----------------------+---------------------+

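So you can see that a filter placed directly after the first expand discards any path that does not start and end with :Label, and only then is the second expand performed.

Provided your Neo4j version is 2.2 or later, p1 and p2 will not include the same relationships.

You can actually see this filtering in the Filter just before the ProduceResults operator (the second row):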

all(x1 in NodesFunction(ProjectedPath(Set(r1, n1),)) where x1:Label) 
AND none(r1 in r1 where any(r2 in r2 where r1 == r2)) 
AND n1:Label

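You should now also notice that all the nodes in the path are only checked in that last filter. So a path like (a:Label)-(b:Blah)-(c:Label) will still pass through the first segment and will only be filtered out just before the results are produced.

So you can optimise further by checking that all the nodes of each segment have :Label as soon as that segment is matched, and then manually checking that no relationship from the previous segment is reused. Showing only the second stage: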

WITH n2, r1
MATCH p2 = (n2:Label)-[r2*..5]-(n3:Label)
WHERE all(x2 in nodes(p2) WHERE (x2:Label))
AND none(a IN r1 WHERE any(b IN r2 WHERE a = b))
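
Putting both stages together, a minimal sketch might look like the following. It assumes the endpoints are the 'foo' and 'foo2' nodes from the question, and it collects each segment's relationships with relationships(...) instead of binding a list variable in the pattern; the names rels1 and rel are only illustrative.

```cypher
// Stage 1: expand at most 5 hops from the start node and keep only all-:Label segments.
MATCH p1 = (n1:Label {NAME: 'foo'})-[*..5]-(n2:Label)
WHERE all(x1 IN nodes(p1) WHERE x1:Label)
WITH n2, p1, relationships(p1) AS rels1
// Stage 2: continue from each surviving midpoint n2 towards the end node.
MATCH p2 = (n2)-[*..5]-(n3:Label {NAME: 'foo2'})
WHERE all(x2 IN nodes(p2) WHERE x2:Label)
  // Manual uniqueness check: no relationship of segment 1 may reappear in segment 2.
  AND none(rel IN rels1 WHERE rel IN relationships(p2))
RETURN p1, p2
```

Note that this form only covers overall paths of 2 to 10 hops, and the same overall path can be returned more than once, once for each midpoint n2 at which it can be split, so the results may need de-duplicating.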

I forgot to mention it, but keep in mind that queries like this are executed lazily.

Cypher for all paths between two nodes
