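I'm new to Cypher. The query below has been running for the past few hours. I'm trying to infer relationships between "T" nodes by going through the "D" and "R" nodes that sit between them, and I'd like to know whether there is a better way to write it.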
MATCH (t:T)-[r1:T_OF]->(d:D)<-[r2:R_OF]-(m:R)-[r3:R_OF]->(e:D)<-[r4:T_OF]-(u:T)
WHERE t.name <> u.name AND d.name <> e.name
RETURN t.name, u.name, count(*) as degree
ORDER BY degree desc
Here are the counts for each node and relationship type:
Nodes
T: 4,657
D: 2,458,733
R: 4,822
Relationships
T_OF: 4,915,004
R_OF: 284,548
Answer 0 (score: 1)
You can add a clause to avoid computing both (t, u) and (u, t), which cuts the size of the cartesian product in half:
MATCH (t:T)-[:T_OF]->(d:D)<-[:R_OF]-(:R)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
AND t.name <> u.name
AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC
or
MATCH (t:T)-[:T_OF]->(d:D)<-[:R_OF]-(:R)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE t.name < u.name
AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC
which doesn't cost an extra read of the ids.
It probably makes no difference, but you can also avoid binding variables you don't use (r1, r2, r3, m, r4).
It is hard to optimize a query without the matching data and without being able to PROFILE it. However, I see that you have far more T_OF relationships than R_OF relationships, so changing the traversal order to prune branches earlier may help:
MATCH (m:R)-[:R_OF]->(d:D)<-[:T_OF]-(t:T)
MATCH (m)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
AND t.name <> u.name
AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC
or even
MATCH (m:R)-[:R_OF]->(d:D)
MATCH (m)-[:R_OF]->(e:D)
WHERE d.name <> e.name
MATCH (d:D)<-[:T_OF]-(t:T), (e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
AND t.name <> u.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC
You can also try the same id() trick (or the name ordering) to reduce the size of the first cartesian product, but you then need to recombine the pairs at the end:
MATCH (m:R)-[:R_OF]->(d:D)
MATCH (m)-[:R_OF]->(e:D)
// Keep only one ordering of each (d, e) pair
WHERE id(d) < id(e)
AND d.name <> e.name
MATCH (d:D)<-[:T_OF]-(t:T), (e:D)<-[:T_OF]-(u:T)
WHERE t.name <> u.name
WITH t.name AS name1, u.name AS name2, count(*) AS degree
// Put the smaller name first so (t, u) and (u, t) fall into the same group
WITH CASE WHEN name1 < name2 THEN name1 ELSE name2 END AS name1,
CASE WHEN name1 < name2 THEN name2 ELSE name1 END AS name2,
degree
RETURN name1, name2, sum(degree) AS degree
ORDER BY degree DESC
All of these possibilities need to be profiled (on a smaller set, or using EXPLAIN to get the plan, though that is only theoretical and PROFILE is more informative) to see whether they lead anywhere.
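As a rough sketch of what profiling on a smaller set could look like, the query below prefixes one of the variants above with PROFILE and restricts it to a sample of R nodes. The sample size of 100 is an arbitrary assumption, not something taken from the question, and the degrees returned only cover that sample, so they are useful for comparing plans rather than as final results.

```cypher
// PROFILE runs the query and reports rows and db hits per operator.
// Replace PROFILE with EXPLAIN to see the plan without executing anything.
PROFILE
MATCH (m:R)
// Assumption: 100 is an arbitrary sample size chosen to keep the run short
WITH m LIMIT 100
MATCH (m)-[:R_OF]->(d:D)<-[:T_OF]-(t:T)
MATCH (m)-[:R_OF]->(e:D)<-[:T_OF]-(u:T)
WHERE id(t) < id(u)
  AND d.name <> e.name
RETURN t.name, u.name, count(*) AS degree
ORDER BY degree DESC
```

Running the same sampled prefix in front of each variant makes the operator-level row counts and db hits directly comparable between them.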