Neo4j - Performance: finding nodes that have ... to the source node

Asked: 2017-03-14 16:07:54

Tags: performance neo4j cypher

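I have the following requirements:

  • Given a source node
  • Find all nodes within a certain range (e.g. 4 hops)
  • where the destination node has the special label "X"
  • Restrict the label types of the nodes on the path to the destination
  • Return only paths of the shortest length (e.g. if I find a node at 2 hops, do not also return nodes at 3 or 4 hops)
  • Return all nodes needed to display the path between the source and the destination node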

I managed to write a query, but it does not perform well. I think this is because there are a great many nodes with the label "X":

MATCH path = allShortestPaths((source)-[*..4]-(destination))
WHERE source.objectID IN ['001614914']
AND source:Y
AND destination:X
AND ALL(x IN nodes(path)[1..] WHERE any(l in labels(x) WHERE l in ['A', 'B', 'C']))
WITH path
LIMIT 1000
WITH COLLECT(path) AS paths, MIN(length(path)) AS minLength 
WITH FILTER(p IN paths WHERE length(p)= minLength) AS pathList
LIMIT 25
UNWIND pathList as path
WITH [n in nodes(path)] as nodes
return nodes

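Profile: Profile with shortest path

If I change the query so that it does not use the shortest-path function, it works well as long as the source does not have many outgoing paths: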

MATCH path = ((source)-[*..4]-(destination))
WHERE source.objectID IN ['001614914']
AND source:Y
AND destination:X
AND ALL(x IN nodes(path)[1..] WHERE any(l in labels(x) WHERE l in ['A', 'B', 'C']))
WITH path
LIMIT 1000
WITH COLLECT(path) AS paths, MIN(length(path)) AS minLength 
WITH FILTER(p IN paths WHERE length(p)= minLength) AS pathList
LIMIT 25
UNWIND pathList as path
WITH [n in nodes(path)] as nodes
return nodes

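Profile: Profile without shortest path

But if the source node has many children, performance is poor here as well...

Thinking about it, it might be better to start with a simple search for all destinations and then call shortestPath on each destination found, but I am not sure.

For example: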

MATCH (source)-[*..4]-(destination)
WHERE source.objectID IN ['001614914']
AND source:Y
AND destination:X
WITH destination
LIMIT 100
call apoc (shortest path ...)
...

Or is there a better way?

1 answer:

Answer 0: (score: 2)

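You may want to try APOC's path expander with 'NODE_GLOBAL' uniqueness, which usually performs better than a variable-length match. It also has a way to whitelist node labels during traversal, but the whitelist applies to the start node as well, so we have to include :Y in it.

See if this works for you: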

MATCH (source:Y)
WHERE source.objectID IN ['001614914']
CALL apoc.path.expandConfig(source, {labelFilter:'+A|B|C|Y', maxLevel:4, uniqueness:'NODE_GLOBAL'}) YIELD path
WITH path, last(nodes(path)) as destination
WHERE destination:X AND NONE(node in TAIL(nodes(path)) WHERE node:Y)
// all the rest is the same as your old query
WITH path
LIMIT 1000
WITH COLLECT(path) AS paths, MIN(length(path)) AS minLength 
WITH FILTER(p IN paths WHERE length(p)= minLength) AS pathList
LIMIT 25
UNWIND pathList as path
RETURN NODES(path) as nodes
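
As a variation on the query above, here is a minimal sketch for newer Neo4j and APOC versions. It rests on assumptions that are not part of the original answer: that the installed APOC release supports the `filterStartNode` and `bfs` config options and the `+A|+B|+C` label filter form, and that the server is Neo4j 4.0 or later, where `FILTER()` is no longer available and a list comprehension takes its place. With `filterStartNode: false`, the whitelist should no longer apply to the source, so `:Y` is left out of it and the extra `NONE(...)` check is dropped.

```cypher
MATCH (source:Y)
WHERE source.objectID IN ['001614914']
// Assumption: this APOC version supports filterStartNode and bfs.
CALL apoc.path.expandConfig(source, {
  labelFilter: '+A|+B|+C',     // whitelist for traversed nodes
  maxLevel: 4,                 // at most 4 hops
  uniqueness: 'NODE_GLOBAL',   // visit each node at most once
  bfs: true,                   // breadth-first, so shorter paths come first
  filterStartNode: false       // do not apply the whitelist to the source
}) YIELD path
WITH path, last(nodes(path)) AS destination
WHERE destination:X
WITH path
LIMIT 1000
WITH collect(path) AS paths, min(length(path)) AS minLength
// A list comprehension stands in for FILTER() here.
UNWIND [p IN paths WHERE length(p) = minLength] AS path
RETURN nodes(path) AS nodes
LIMIT 25
```

One difference from `allShortestPaths` is worth keeping in mind: because 'NODE_GLOBAL' visits every node at most once, the expander yields at most one path per destination, not every shortest path to it.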