使用 Neo4J 2.1.5 。
数据:
2000人
目标:为每个人计算朋友,朋友的总数。朋友,朋友'朋友'朋友。
结果如下:
人FullName |朋友总计|朋友-2总计|朋友-3总计|全球总数
MATCH (person:Person)
WITH person
OPTIONAL MATCH person-[:KNOWS]-(p2:Person)
WITH person, count(p2) as f1
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..2]-(f2:Person))
WHERE length(path) = 2
WITH count(nodes(path)[-1]) AS f2, person, f1
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..3]-(f3:Person))
WHERE length(path) = 3
WITH count(nodes(path)[-1]) AS f3, person, f2, f1
RETURN person._firstName + " " + person._lastName, f1, f2, f3, f1+f2+f3 AS total
技巧是避免使用循环图进行错误的计算;这就是我使用shortestPath
的原因。
但是,这个查询持续时间很长:60秒! 有可能进行任何优化吗?
答案 0 :(得分:1)
[EDITED]
这对你有用吗?
MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(p1:Person)
WITH person, COALESCE(COLLECT(p1),[]) AS p1s
WITH person, CASE p1s WHEN [] THEN [NULL] ELSE p1s END AS p1s
UNWIND p1s AS p1
OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)
WHERE NOT ((p2 = person) OR (p2 IN p1s))
WITH person, p1s, COALESCE(COLLECT(DISTINCT p2),[]) AS p2s
WITH person, p1s, CASE p2s WHEN [] THEN [NULL] ELSE p2s END AS p2s UNWIND p2s AS p2
OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)
WHERE NOT ((p3 = person) OR (p3 IN p1s) OR (p3 IN p2s))
WITH person,
CASE p1s WHEN [NULL] THEN 0 ELSE SIZE(p1s) END AS f1,
CASE p2s WHEN [NULL] THEN 0 ELSE SIZE(p2s) END AS f2,
COUNT(DISTINCT p3) AS f3
RETURN person.firstName + " " + person.lastName, f1, f2, f3, f1+f2+f3 AS total;
每位朋友只计算一次。
以下是对一些比较模糊的策略的解释。查询必须使用p1s
替换空的p2s
和[NULL]
集合,以便UNWIND
不会中止查询的其余部分。然后,在计算集合的大小时,我们需要为[NULL]
个集合提供0
的计数。