This Cypher query is very slow; can it be optimized?

Date: 2016-03-15 19:14:44

Tags: neo4j cypher

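I am using Neo4j 2.1.5.

Data:

2000 persons. Goal: for each person, compute the total number of friends, friends of friends, and friends of friends of friends.
The result should look like this:
Person FullName | Friends total | Friends-2 total | Friends-3 total | Global total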

MATCH (person:Person)
WITH person
OPTIONAL MATCH person-[:KNOWS]-(p2:Person)
WITH person, count(p2) as f1
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..2]-(f2:Person))
WHERE length(path) = 2
WITH count(nodes(path)[-1]) AS f2, person, f1
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..3]-(f3:Person))
WHERE length(path) = 3
WITH count(nodes(path)[-1]) AS f3, person, f2, f1
RETURN person._firstName + " " + person._lastName, f1, f2, f3, f1+f2+f3 AS total

The tricky part is avoiding miscounts when the graph contains cycles; that is why I use shortestPath.
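
To make the cycle problem concrete, here is a minimal Python sketch rather than Cypher. It assumes the KNOWS relationships are held in a plain adjacency dictionary; the names KNOWS, friends_by_distance and naive_two_hop_count, and the toy data, are hypothetical and only illustrate the counting rule the query is aiming for: each person is counted once, at their shortest distance.

```python
from collections import deque


def friends_by_distance(graph, start, max_depth=3):
    """Count people at each exact shortest distance 1..max_depth from start."""
    distance = {start: 0}
    queue = deque([start])
    counts = [0] * max_depth
    while queue:
        node = queue.popleft()
        if distance[node] == max_depth:
            continue  # do not expand beyond the deepest level we report
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:  # first visit is the shortest route
                distance[neighbour] = distance[node] + 1
                counts[distance[neighbour] - 1] += 1
                queue.append(neighbour)
    return counts


def naive_two_hop_count(graph, start):
    """Count the endpoint of every 2-step walk, which overcounts on cycles."""
    return sum(
        1
        for first in graph.get(start, ())
        for second in graph.get(first, ())
        if second != start
    )


# Hypothetical toy data: a triangle (alice, bob, carol) plus dave,
# who knows only carol.
KNOWS = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "carol"],
    "carol": ["alice", "bob", "dave"],
    "dave": ["carol"],
}

print(friends_by_distance(KNOWS, "alice"))  # [2, 1, 0]
print(naive_two_hop_count(KNOWS, "alice"))  # 3
```

In the triangle, the naive walk reaches bob and carol again at two hops even though both are already direct friends, so it reports 3 instead of the single real second-level friend, dave.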

However, the query takes a long time: 60 seconds! Is any optimization possible?

1 Answer:

Answer 0 (score: 1)

[EDITED]

Does this work for you?

MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(p1:Person)
WITH person, COALESCE(COLLECT(p1),[]) AS p1s 
WITH person, CASE p1s WHEN [] THEN [NULL] ELSE p1s END AS p1s
UNWIND p1s AS p1
OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)
WHERE NOT ((p2 = person) OR (p2 IN p1s))
WITH person, p1s, COALESCE(COLLECT(DISTINCT p2),[]) AS p2s
WITH person, p1s, CASE p2s WHEN [] THEN [NULL] ELSE p2s END AS p2s
UNWIND p2s AS p2
OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)
WHERE NOT ((p3 = person) OR (p3 IN p1s) OR (p3 IN p2s))
WITH person,
  CASE p1s WHEN [NULL] THEN 0 ELSE SIZE(p1s) END AS f1,
  CASE p2s WHEN [NULL] THEN 0 ELSE SIZE(p2s) END AS f2,
  COUNT(DISTINCT p3) AS f3
RETURN person.firstName + " " + person.lastName, f1, f2, f3, f1+f2+f3 AS total;

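Each friend is counted only once, at the closest level at which that friend is reachable.

Here is an explanation of some of the more obscure tactics. The query has to replace empty p1s and p2s collections with [NULL], so that UNWIND does not abort the rest of the query for that person. Then, when calculating the sizes of the collections, we need to report a count of 0 for [NULL] collections.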
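
The effect of the [NULL] placeholder can be modelled outside the database. The following is a rough Python sketch of the row semantics only, not of how Neo4j executes the query; unwind, pad_empty, level_size and the sample rows are hypothetical names and data introduced for illustration.

```python
def unwind(rows, list_key, item_key):
    """Model of UNWIND: emit one output row per element of row[list_key]."""
    for row in rows:
        for item in row[list_key]:
            yield {**row, item_key: item}


def pad_empty(items):
    """Mirror of: CASE xs WHEN [] THEN [NULL] ELSE xs END."""
    return items if items else [None]


def level_size(items):
    """Mirror of: CASE xs WHEN [NULL] THEN 0 ELSE SIZE(xs) END."""
    return 0 if items == [None] else len(items)


rows = [
    {"person": "alice", "p1s": ["bob", "carol"]},
    {"person": "erin", "p1s": []},  # hypothetical person with no friends
]

# Without padding, the empty list yields no rows, so erin disappears.
without_padding = list(unwind(rows, "p1s", "p1"))

# With padding, erin keeps one row whose p1 is None.
padded_rows = [{**row, "p1s": pad_empty(row["p1s"])} for row in rows]
with_padding = list(unwind(padded_rows, "p1s", "p1"))

print(sorted({row["person"] for row in without_padding}))  # ['alice']
print(sorted({row["person"] for row in with_padding}))     # ['alice', 'erin']
```

Because the placeholder list has length 1, the final size calculation has to map it back to 0, which is what level_size does here and what the two CASE expressions do in the query.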