How can I optimize a multi-node Neo4j query?

Time: 2016-12-21 05:23:53

Tags: neo4j cypher

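I am new to Cypher. The following query has been running for the past few hours. It tries to infer relationships between "T" nodes by going through the "D" and "R" nodes that sit between them. I would like to know whether there is a better way to write it.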

MATCH (t:T)-[r1:T_OF]->(d:D)<-[r2:R_OF]-(m:R)-[r3:R_OF]->(e:D)<-[r4:T_OF]-(u:T) 
WHERE t.name <> u.name AND d.name <> e.name 
RETURN t.name, u.name, count(*) as degree 
ORDER BY degree desc

Here are the counts for each node label and relationship type:

Nodes
T: 4,657
D: 2,458,733
R: 4,822

Relationships
T_OF: 4,915,004
R_OF: 284,548

1 answer:

Answer 0 (score: 1)

You can add a clause to avoid computing both (t, u) and (u, t), which halves the size of the Cartesian product:

MATCH (t:T)-[:T_OF]->(d:D)<-[:R_OF]-(:R)-[:R_OF]->(e:D)<-[:T_OF]-(u:T) 
WHERE id(t) < id(u)
  AND t.name <> u.name
  AND d.name <> e.name 
RETURN t.name, u.name, count(*) AS degree 
ORDER BY degree DESC

Or:

MATCH (t:T)-[:T_OF]->(d:D)<-[:R_OF]-(:R)-[:R_OF]->(e:D)<-[:T_OF]-(u:T) 
WHERE t.name < u.name
  AND d.name <> e.name 
RETURN t.name, u.name, count(*) AS degree 
ORDER BY degree DESC

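This variant avoids the cost of the extra id() lookups, and the strict inequality on the names already implies t.name <> u.name.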
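To see why the ordering clause halves the result, here is a minimal sketch on made-up data. The node names ('A', 'B', 'd1', 'd2', 'r1') are hypothetical and are not taken from the question's dataset; the statements are meant for a scratch database only.

```cypher
// Hypothetical toy graph: two T nodes linked through one R node and two D nodes.
CREATE (a:T {name: 'A'}), (b:T {name: 'B'}),
       (d1:D {name: 'd1'}), (d2:D {name: 'd2'}),
       (r:R {name: 'r1'}),
       (a)-[:T_OF]->(d1), (b)-[:T_OF]->(d2),
       (r)-[:R_OF]->(d1), (r)-[:R_OF]->(d2);

// Without an ordering clause the same pair is found in both directions:
// one row for (A, B) and one row for (B, A).
MATCH (t:T)-[:T_OF]->(d:D)<-[:R_OF]-(:R)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE t.name <> u.name AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree;

// With t.name < u.name only the (A, B) row remains.
MATCH (t:T)-[:T_OF]->(d:D)<-[:R_OF]-(:R)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE t.name < u.name AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree;
```

The degree for each surviving pair is unchanged; only the mirrored duplicate row disappears.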

It might not make a difference, but you can also avoid binding variables you don't use (r1, r2, r3, r4, m).

It's hard to optimize a query when you don't have matching data and can't PROFILE it. However, I see that you have far more T_OF relationships than R_OF relationships, so changing the traversal order to prune branches earlier might help:

MATCH (m:R)-[:R_OF]->(d:D)<-[:T_OF]-(t:T)
MATCH (m)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
  AND t.name <> u.name
  AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC

or even

MATCH (m:R)-[:R_OF]->(d:D)
MATCH (m)-[:R_OF]->(e:D)
WHERE d.name <> e.name
MATCH (d:D)<-[:T_OF]-(t:T), (e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
  AND t.name <> u.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC

You can also try the same id() trick (or ordering by name) to shrink the first Cartesian product, but then the pairs need to be regrouped at the end:

MATCH (m:R)-[:R_OF]->(d:D)
MATCH (m)-[:R_OF]->(e:D)
WHERE id(d) < id(e)
  AND d.name <> e.name
MATCH (d:D)<-[:T_OF]-(t:T), (e:D)<-[:T_OF]-(u:T)
WHERE t.name <> u.name
WITH t.name AS name1, u.name AS name2, count(*) AS degree
WITH CASE WHEN name1 < name2 THEN name1 ELSE name2 END AS name1,
     CASE WHEN name1 < name2 THEN name2 ELSE name1 END AS name2,
     degree
RETURN name1, name2, sum(degree) AS degree
ORDER BY degree DESC

All of these possibilities need to be profiled (on a smaller dataset, or with EXPLAIN to get just the plan, although that is only theory and PROFILE is more informative) to see whether they lead anywhere.
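As a rough sketch of how that comparison could be run: prefixing a query with EXPLAIN returns the plan without executing it, while PROFILE executes the query and reports rows and database hits per operator. The LIMIT of 50 on the R nodes below is an arbitrary assumption to keep the profiled run short; it is not part of the original query, and the counts it produces cover only that sample.

```cypher
// EXPLAIN: show the planned operators without running the query.
EXPLAIN
MATCH (m:R)-[:R_OF]->(d:D)<-[:T_OF]-(t:T)
MATCH (m)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
  AND t.name <> u.name
  AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC;

// PROFILE: actually run the query, here on a small sample of R nodes
// (the LIMIT value is an assumption chosen for illustration).
PROFILE
MATCH (m:R)
WITH m LIMIT 50
MATCH (m)-[:R_OF]->(d:D)<-[:T_OF]-(t:T)
MATCH (m)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
  AND t.name <> u.name
  AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC;
```

Running the same pair of statements for each candidate rewrite and comparing the total database hits should give a first indication of which traversal order prunes the most work.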