Question

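The database has 15M nodes and 150M relationships. When I run the following Cypher query, it takes more than 200 seconds to return a result. CPU and memory usage on the machine stay low. What can I do to improve this? I would appreciate any suggestions.

Cypher: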

START a=node:node_auto_index(userId='32887522') 
MATCH a -[:RELATIONSHIP_TYPE_FRIEND]- b -[:RELATIONSHIP_TYPE_FRIEND]- c 
WHERE NOT(a -[:RELATIONSHIP_TYPE_FRIEND]- c) AND NOT(a=c) 
RETURN c.userId as userId, COUNT(b) AS commonFriends 
ORDER BY commonFriends DESC 
LIMIT 100;

Execution plan:

ColumnFilter(symKeys=["userId", "  INTERNAL_AGGREGATE24121597-f14e-4ddf-b29d-0e3397500829"], returnItemNames=["userId", "commonFriends"], _rows=100, _db_hits=0)

==> Top(orderBy=["SortItem(Cached(  INTERNAL_AGGREGATE24121597-f14e-4ddf-b29d-0e3397500829 of type Long),false)"], limit="Literal", _rows=100, _db_hits=0)

==>   EagerAggregation(keys=["Cached(userId of type Any)"], aggregates=["(  INTERNAL_AGGREGATE24121597-f14e-4ddf-b29d-0e3397500829,Count)"], _rows=3656, _db_hits=0)

==>     Extract(symKeys=["  UNNAMED60", "a", "b", "  UNNAMED92", "c"], exprKeys=["userId"], _rows=15416, _db_hits=15416)

==>       Filter(pred="(NOT(nonEmpty(a-[  UNNAMED137:RELATIONSHIP_TYPE_FRIEND]-c)) AND NOT(a == c))", _rows=15416, _db_hits=0)

==>         TraversalMatcher(trail="(a)-[  UNNAMED60:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(b)-[  UNNAMED92:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(c)", _rows=15470, _db_hits=15547)

==>           ParameterPipe(_rows=1, _db_hits=0)

Answer 1

On Neo4j 2.2, try a query that looks more like this:

START me=node:node_auto_index(userId='32887522')
MATCH (me)-[:RELATIONSHIP_TYPE_FRIEND]-(people)
WITH me, COLLECT(people) as friends
MATCH (me)-[:RELATIONSHIP_TYPE_FRIEND]-(people)-[:RELATIONSHIP_TYPE_FRIEND]-(fof)
WHERE me <> fof 
WITH me, fof, COUNT(*) AS freq, friends
WHERE NOT (fof IN friends)
WITH fof, freq
RETURN fof.userId, freq
ORDER BY freq DESC 
LIMIT 10

It comes close to the optimal Java approach: http://maxdemarzi.com/2014/04/24/translating-cypher-to-neo4j-java-api-2-0/

Answer 2

I simplified the Cypher to:

START me=node:node_auto_index(userId='32887522')
MATCH (me)-[:RELATIONSHIP_TYPE_FRIEND]-(people)-[:RELATIONSHIP_TYPE_FRIEND]-(fof)
RETURN fof, count(*) AS commonFriends
ORDER BY commonFriends DESC
LIMIT 100;

and got this execution plan:

ColumnFilter(symKeys=["fof", "  INTERNAL_AGGREGATEeaff758c-8eda-498a-9366-9965f62d16fc"], returnItemNames=["fof", "commonFriends"], _rows=100, _db_hits=0)

==> Top(orderBy=["SortItem(Cached(  INTERNAL_AGGREGATEeaff758c-8eda-498a-9366-9965f62d16fc of type Long),false)"], limit="Literal", _rows=100, _db_hits=0)

==>   EagerAggregation(keys=["fof"], aggregates=["(  INTERNAL_AGGREGATEeaff758c-8eda-498a-9366-9965f62d16fc,CountStar)"], _rows=3661, _db_hits=0)

==>     TraversalMatcher(trail="(me)-[  UNNAMED62:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(people)-[  UNNAMED99:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(fof)", _rows=15470, _db_hits=15547)

==>       ParameterPipe(_rows=1, _db_hits=0)

It is still very slow. Any idea why?

My Neo4j Cypher query is very slow. How can I improve it?
