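The database has 15M nodes and 150M relationships. When I run the following Cypher query, it takes more than 200 seconds to return a result, while CPU and memory usage on the machine stay low. What can I do to improve this? I would appreciate any suggestions.
Cypher: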
START a=node:node_auto_index(userId='32887522')
MATCH a -[:RELATIONSHIP_TYPE_FRIEND]- b -[:RELATIONSHIP_TYPE_FRIEND]- c
WHERE NOT(a -[:RELATIONSHIP_TYPE_FRIEND]- c) AND NOT(a=c)
RETURN c.userId as userId, COUNT(b) AS commonFriends
ORDER BY commonFriends DESC
LIMIT 100;
Execution plan:
ColumnFilter(symKeys=["userId", " INTERNAL_AGGREGATE24121597-f14e-4ddf-b29d-0e3397500829"], returnItemNames=["userId", "commonFriends"], _rows=100, _db_hits=0)
==> Top(orderBy=["SortItem(Cached( INTERNAL_AGGREGATE24121597-f14e-4ddf-b29d-0e3397500829 of type Long),false)"], limit="Literal", _rows=100, _db_hits=0)
==> EagerAggregation(keys=["Cached(userId of type Any)"], aggregates=["( INTERNAL_AGGREGATE24121597-f14e-4ddf-b29d-0e3397500829,Count)"], _rows=3656, _db_hits=0)
==> Extract(symKeys=[" UNNAMED60", "a", "b", " UNNAMED92", "c"], exprKeys=["userId"], _rows=15416, _db_hits=15416)
==> Filter(pred="(NOT(nonEmpty(a-[ UNNAMED137:RELATIONSHIP_TYPE_FRIEND]-c)) AND NOT(a == c))", _rows=15416, _db_hits=0)
==> TraversalMatcher(trail="(a)-[ UNNAMED60:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(b)-[ UNNAMED92:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(c)", _rows=15470, _db_hits=15547)
==> ParameterPipe(_rows=1, _db_hits=0)
Answer 0 (score: 1)
On Neo4j 2.2, try a query that looks more like this:
START me=node:node_auto_index(userId='32887522')
MATCH (me)-[:RELATIONSHIP_TYPE_FRIEND]-(people)
WITH me, COLLECT(people) as friends
MATCH (me)-[:RELATIONSHIP_TYPE_FRIEND]-(people)-[:RELATIONSHIP_TYPE_FRIEND]-(fof)
WHERE me <> fof
WITH me, fof, COUNT(*) AS freq, friends
WHERE NOT (fof IN friends)
WITH fof, freq
RETURN fof.userId, freq
ORDER BY freq DESC
LIMIT 10
It comes close to the optimal Java approach => http://maxdemarzi.com/2014/04/24/translating-cypher-to-neo4j-java-api-2-0/
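To make the steps of the query above easier to follow, here is a minimal in-memory sketch of the same logic in Python. It does not use Neo4j at all: the `graph` dictionary, the function name `suggest_friends`, and the sample user ids are all hypothetical, and the secondary sort on the id is only there to make the output deterministic (the Cypher query does not define an order among ties).

```python
from collections import Counter


def suggest_friends(graph, me, limit=10):
    """Rank friends-of-friends of `me` by number of connecting paths.

    `graph` is an assumed in-memory structure: a dict mapping a user id
    to the set of that user's friends (friendship is undirected).
    """
    # Step 1: collect the direct friends once (COLLECT(people) AS friends).
    friends = graph.get(me, set())

    # Step 2: expand two hops and count paths per friend-of-friend
    # (COUNT(*) AS freq), skipping `me` itself (WHERE me <> fof).
    freq = Counter()
    for friend in friends:
        for fof in graph.get(friend, set()):
            if fof != me:
                freq[fof] += 1

    # Step 3: drop people who are already friends, after aggregation
    # (WHERE NOT (fof IN friends)), so the check runs once per candidate.
    candidates = [(fof, n) for fof, n in freq.items() if fof not in friends]

    # Step 4: ORDER BY freq DESC, then LIMIT. The id is a tie-breaker
    # added for this sketch only.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates[:limit]


graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d", "e"},
    "d": {"b", "c"},
    "e": {"c"},
}
print(suggest_friends(graph, "a"))  # prints [('d', 2), ('e', 1)]
```

The point of the sketch is the ordering of the steps: the "already a friend" check is a set lookup applied to the aggregated candidates, rather than a pattern check evaluated for every two-hop path as in the original query.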
Answer 1 (score: 0)
I simplified the Cypher to:
START me=node:node_auto_index(userId='32887522')
MATCH (me)-[:RELATIONSHIP_TYPE_FRIEND]-(people)-[:RELATIONSHIP_TYPE_FRIEND]-(fof)
RETURN fof, count(*) AS commonFriends
ORDER BY commonFriends DESC
LIMIT 100;
and got this execution plan:
ColumnFilter(symKeys=["fof", " INTERNAL_AGGREGATEeaff758c-8eda-498a-9366-9965f62d16fc"], returnItemNames=["fof", "commonFriends"], _rows=100, _db_hits=0)
==> Top(orderBy=["SortItem(Cached( INTERNAL_AGGREGATEeaff758c-8eda-498a-9366-9965f62d16fc of type Long),false)"], limit="Literal", _rows=100, _db_hits=0)
==> EagerAggregation(keys=["fof"], aggregates=["( INTERNAL_AGGREGATEeaff758c-8eda-498a-9366-9965f62d16fc,CountStar)"], _rows=3661, _db_hits=0)
==> TraversalMatcher(trail="(me)-[ UNNAMED62:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(people)-[ UNNAMED99:RELATIONSHIP_TYPE_FRIEND WHERE true AND true]-(fof)", _rows=15470, _db_hits=15547)
==> ParameterPipe(_rows=1, _db_hits=0)
It is still slow, and I don't know why.