Question

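I am trying to compare users based on the interests they have in common in a graph. I understand why the following query produces duplicate pairs, but I can't think of a good way to avoid it in Cypher. Is there any way to do this without looping in Cypher?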

neo4j-sh (?)$ start n=node(*) match p=n-[:LIKES]->item<-[:LIKES]-other where n <> other return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;
==> +-----------------------------------------------+
==> | n.name | other.name | common           | freq |
==> +-----------------------------------------------+
==> | "u1"   | "u2"       | ["f1","f2","f3"] | 3    |
==> | "u2"   | "u1"       | ["f1","f2","f3"] | 3    |
==> | "u1"   | "u3"       | ["f1","f2"]      | 2    |
==> | "u3"   | "u2"       | ["f1","f2"]      | 2    |
==> | "u2"   | "u3"       | ["f1","f2"]      | 2    |
==> | "u3"   | "u1"       | ["f1","f2"]      | 2    |
==> | "u4"   | "u3"       | ["f1"]           | 1    |
==> | "u4"   | "u2"       | ["f1"]           | 1    |
==> | "u4"   | "u1"       | ["f1"]           | 1    |
==> | "u2"   | "u4"       | ["f1"]           | 1    |
==> | "u1"   | "u4"       | ["f1"]           | 1    |
==> | "u3"   | "u4"       | ["f1"]           | 1    |
==> +-----------------------------------------------+

Answer 1

To avoid duplicates of the form a--b and b--a, you can exclude one of the two combinations in your WHERE clause with

WHERE ID(a) < ID(b)

Applied to your query above:

start n=node(*) match p=n-[:LIKES]->item<-[:LIKES]-other where ID(n) < ID(other) return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;
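
To see why the ID comparison keeps exactly one row per unordered pair, here is a minimal Python sketch of the same filter outside the database. The numeric ids and the `likes` data are assumptions, chosen to mirror the output shown in the question; they are not taken from the asker's real graph.

```python
from itertools import permutations

# Hypothetical data: node id -> (name, liked items), mirroring the question's output.
likes = {
    1: ("u1", {"f1", "f2", "f3"}),
    2: ("u2", {"f1", "f2", "f3"}),
    3: ("u3", {"f1", "f2"}),
    4: ("u4", {"f1"}),
}

def common_interests(dedupe):
    rows = []
    # permutations yields both (a, b) and (b, a), like matching with n <> other.
    for a, b in permutations(likes, 2):
        # Same idea as WHERE ID(n) < ID(other): keep one ordering per pair.
        if dedupe and not a < b:
            continue
        common = sorted(likes[a][1] & likes[b][1])
        if common:
            rows.append((likes[a][0], likes[b][0], common, len(common)))
    # Order by freq descending, as in the query.
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows

all_rows = common_interests(dedupe=False)
unique_rows = common_interests(dedupe=True)
print(len(all_rows), len(unique_rows))
```

Because two distinct ids always compare one way or the other, each unordered pair passes the filter exactly once, so the deduplicated result has half as many rows.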

Answer 2

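OK, I see you are using (*) as the start point, which means looping over the whole graph and treating every node as a start point. So the rows are different, not duplicates as you say.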

+-----------------------------------------------+
| n.name | other.name | common           | freq |
+-----------------------------------------------+
| "u2"   | "u1"       | ["f1","f2","f3"] | 3    |

is not equal to:

+-----------------------------------------------+
| n.name | other.name | common           | freq |
+-----------------------------------------------+
| "u1"   | "u2"       | ["f1","f2","f3"] | 3    |

So, as I see it, if you use an index and set a single start point, there won't be any duplicates.

start n=node:someIndex(name='C') match p=n-[:LIKES]->item<-[:LIKES]-other where n <> other return n.name,other.name,collect(item.name) as common, count(*) as freq order by freq desc;
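
As a rough illustration of what fixing the start node does, here is a small Python sketch. The `likes` data and the `pairs_from` helper are hypothetical, modeled on the output in the question. Note that this approach only returns pairs that include the chosen start node, so it does not list every pair in the graph.

```python
# Hypothetical data mirroring the question's output.
likes = {
    "u1": {"f1", "f2", "f3"},
    "u2": {"f1", "f2", "f3"},
    "u3": {"f1", "f2"},
    "u4": {"f1"},
}

def pairs_from(start):
    rows = []
    for other, items in likes.items():
        if other == start:  # same role as WHERE n <> other
            continue
        common = sorted(likes[start] & items)
        if common:
            rows.append((start, other, common, len(common)))
    # Order by freq descending, as in the query.
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows

rows = pairs_from("u1")
for row in rows:
    print(row)
```

With one fixed start node, the first column is always that node, so no mirrored row such as ("u2", "u1") can appear.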

Querying unique pairs of nodes in Cypher when pair order does not matter
