I have a very simple (at least to understand) Cypher query that takes 10-15 seconds on average for a random start node:
START hospital1 = node:Hospitals(id="xxx")
MATCH (hospital1)-[:CHILD_PROVIDER]->(provider1)-[referral:REFERRED]-(provider2)<-[:CHILD_PROVIDER]-(hospital2)
WHERE hospital1 <> hospital2
RETURN hospital1, SUM(referral.count), hospital2;
Part of the PROFILE output (all node references skipped):
==> ColumnFilter(symKeys=["hospital1", "hospital2", " INTERNAL_AGGREGATEd314d0d2-c365-4373-8bc8-047a3824abc4"], returnItemNames=["hospital1", "SUM(referral.count)", "hospital2"], _rows=29, _db_hits=0)
==> EagerAggregation(keys=["hospital1", "hospital2"], aggregates=["( INTERNAL_AGGREGATEd314d0d2-c365-4373-8bc8-047a3824abc4,Sum(Product(referral,count(37),true)))"], _rows=29, _db_hits=1008)
==> Filter(pred="NOT(hospital1 == hospital2)", _rows=1008, _db_hits=0)
==> TraversalMatcher(trail="(hospital1)-[ UNNAMED77:CHILD_PROVIDER WHERE true AND true]->(provider1)-[referral:REFERRED WHERE true AND true]-(provider2)<-[ UNNAMED140:CHILD_PROVIDER WHERE true AND true]-(hospital2)", _rows=10192, _db_hits=54096)
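The (hospital) - (provider) ratio is roughly 1:100-250.
The (provider) - (provider) ratio varies from 0 to about 1:150.
_db_hits=54096 on the TraversalMatcher step looks high to me, and I think it is the reason for the slow response.
I have two main questions:
1) Should I consider adding precomputed relationships, or can the performance be improved somehow with the current graph?
2) In general, how many traversal operations per query are acceptable? A fan-out of 1:10:10:1 looks fine, but at what size should I take it as a sign that some changes are needed (considering that I need to collect information from all matching relationships, not find a shortest path or similar)?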
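To make question 1 concrete, this is roughly what I mean by a precomputed relationship. It is only a sketch: the HOSPITAL_REFERRED relationship type and its total property are names I made up for illustration, it assumes a Cypher version that supports CREATE UNIQUE, and I have not measured whether it actually helps.

```cypher
// Sketch only: HOSPITAL_REFERRED and r.total are hypothetical names.
// Run once per hospital (and again whenever referral counts change)
// to store the aggregated total directly between the two hospitals.
START hospital1 = node:Hospitals(id="xxx")
MATCH (hospital1)-[:CHILD_PROVIDER]->(provider1)-[referral:REFERRED]-(provider2)<-[:CHILD_PROVIDER]-(hospital2)
WHERE hospital1 <> hospital2
WITH hospital1, hospital2, SUM(referral.count) AS total
CREATE UNIQUE (hospital1)-[r:HOSPITAL_REFERRED]->(hospital2)
SET r.total = total;

// The read query would then only touch one relationship per hospital pair.
START hospital1 = node:Hospitals(id="xxx")
MATCH (hospital1)-[r:HOSPITAL_REFERRED]->(hospital2)
RETURN hospital1, r.total, hospital2;
```

The obvious cost is that these totals have to be kept in sync with the underlying REFERRED relationships, which is why I would prefer to speed up the original query if that is possible.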