Neo4j Cypher query is too slow on bidirectional relationships (too many DbHits)

Date: 2015-06-12 04:56:44

Tags: neo4j cypher

I have the following data model:

(Phone{phoneNumber})-[:CALL]-(Phone{phoneNumber})        
(Person{personId})-[:KEEP]-(Phone{personId})    
(Case{caseId})-[:INVOLVE]-(Person{personId}) 

All three use bidirectional relationships, and I created indexes on phoneNumber / personId / caseId.

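Users can enter one or more strings, each of which may be a phoneNumber / caseId / personId, to query how those values are related (in either direction, with a relationship depth of 1 to 4).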

Here is the Cypher query:

match p = n-[r*1..4]-m 
with n,m,p 
where (n.phoneNumber in ["xxx","yyy"] 
       or n.caseSjNo in ["xxx","yyy"] 
       or n.identificationNumber in ["xxx","yyy"]) 
  and (m.phoneNumber in ["xxx","yyy"] 
       or m.caseSjNo in ["xxx","yyy"] 
       or m.identificationNumber in ["xxx","yyy"])
  and n <> m 
return p limit 1000 

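I profiled this query in the shell console. With 10,000 nodes in the Neo4j database, I found that the number of DbHits is enormous. Here are the results (depth = 1 and depth = 4):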

neo4j-sh (?)$ profile match p = n-[r*1..1]-m with n,m,p where (n.phoneNumber in ["XXX","YYY"] or n.caseSjNo in ["XXX","YYY"] or n.identificationNumber in ["XXX","YYY"]) and (m.phoneNumber in ["XXX","YYY"] or m.caseSjNo in ["XXX","YYY"] or m.identificationNumber in ["XXX","YYY"]) and n <> m return p limit 1000;
==> +---+
==> | p |
==> +---+
==> +---+
==> 0 row
==> 
==> ColumnFilter(0)
==>   |
==>   +Slice
==>     |
==>     +Filter
==>       |
==>       +ColumnFilter(1)
==>         |
==>         +ExtractPath
==>           |
==>           +TraversalMatcher
==> 
==> +------------------+-------+--------+-------------+-------+
==> |         Operator |  Rows | DbHits | Identifiers | Other |
==> +------------------+-------+--------+-------------+-------+
==> |  ColumnFilter(0) |     0 |      0 |             | keep columns p |
==> |            Slice |     0 |      0 |             | {  AUTOINT12} |
==> |           Filter |     0 | 480776 |             | ((((any(-_-INNER-_- in Collection(List({  AUTOSTRING0}, {  AUTOSTRING1})) where Property(n,phoneNumber(3)) == -_-INNER-_-) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING2}, {  AUTOSTRING3})) where Property(n,caseSjNo(0)) == -_-INNER-_-)) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING4}, {  AUTOSTRING5})) where Property(n,identificationNumber(2)) == -_-INNER-_-)) AND ((any(-_-INNER-_- in Collection(List({  AUTOSTRING6}, {  AUTOSTRING7})) where Property(m,phoneNumber(3)) == -_-INNER-_-) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING8}, {  AUTOSTRING9})) where Property(m,caseSjNo(0)) == -_-INNER-_-)) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING10}, {  AUTOSTRING11})) where Property(m,identificationNumber(2)) == -_-INNER-_-))) AND NOT(n == m)) |
==> |  ColumnFilter(1) | 20034 |      0 |             | keep columns n, m, p |
==> |      ExtractPath | 20034 |      0 |           p | |
==> | TraversalMatcher | 20034 |  50152 |             | m,   UNNAMED11, m, r |
==> +------------------+-------+--------+-------------+-------+
==> 
==> Total database accesses: 530928

------------------------------------------------------
------------------------------------------------------
neo4j-sh (?)$ profile match p = n-[r*1..4]-m with n,m,p where (n.phoneNumber in ["XXX","YYY"] or n.caseSjNo in ["XXX","YYY"] or n.identificationNumber in ["XXX","YYY"]) and (m.phoneNumber in ["XXX","YYY"] or m.caseSjNo in ["XXX","YYY"] or m.identificationNumber in ["XXX","YYY"]) and n <> m return p limit 1000 ;
==> +---+
==> | p |
==> +---+
==> +---+
==> 0 row
==> 
==> ColumnFilter(0)
==>   |
==>   +Slice
==>     |
==>     +Filter
==>       |
==>       +ColumnFilter(1)
==>         |
==>         +ExtractPath
==>           |
==>           +TraversalMatcher
==> 

==> +------------------+---------+-----------+-------------+-------+
==> |         Operator |    Rows |    DbHits | Identifiers | Other |
==> +------------------+---------+-----------+-------------+-------+
==> |  ColumnFilter(0) |       0 |         0 |             | keep columns p |
==> |            Slice |       0 |         0 |             | {  AUTOINT12} |
==> |           Filter |       0 | 120244220 |             | ((((any(-_-INNER-_- in Collection(List({  AUTOSTRING0}, {  AUTOSTRING1})) where Property(n,phoneNumber(3)) == -_-INNER-_-) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING2}, {  AUTOSTRING3})) where Property(n,caseSjNo(0)) == -_-INNER-_-)) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING4}, {  AUTOSTRING5})) where Property(n,identificationNumber(2)) == -_-INNER-_-)) AND ((any(-_-INNER-_- in Collection(List({  AUTOSTRING6}, {  AUTOSTRING7})) where Property(m,phoneNumber(3)) == -_-INNER-_-) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING8}, {  AUTOSTRING9})) where Property(m,caseSjNo(0)) == -_-INNER-_-)) OR any(-_-INNER-_- in Collection(List({  AUTOSTRING10}, {  AUTOSTRING11})) where Property(m,identificationNumber(2)) == -_-INNER-_-))) AND NOT(n == m)) |
==> |  ColumnFilter(1) | 5010178 |         0 |             | keep columns n, m, p |
==> |      ExtractPath | 5010178 |         0 |           p | |
==> | TraversalMatcher | 5010178 |  20070774 |             | m,   UNNAMED11, m, r |
==> +------------------+---------+-----------+-------------+-------+
==> 
==> Total database accesses: 140314994

The result does come back, but it takes far too long. Any tips for optimizing the query?

Update: with 1,000,000 (1M) nodes in the database, an out-of-memory error occurs.

1 answer:

Answer 0 (score: 4)

Why use bidirectional relationships in the first place? In Neo4j you can always navigate a relationship in both directions.

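  1. Update to a recent Neo4j version (2.2.2).
  2. Use unidirectional relationships.
  3. Use labels.
  4. Create indexes for the labels.
  5. The WITH in between doesn't help, because it separates the conditions from the pattern.
  6. Because you have a "generic" entity concept (x can be "any" type), I suggest adding an :Entity label and using an indexed id field on it.
  7. Are you really interested in all paths? Or just allShortestPaths?

See: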

    create index on :Entity(id);
    
    
    match (n:Entity),(m:Entity) 
    where n.id in ["xxx","yyy"] and m.id in ["xxx","yyy"] and n<>m
    match p = (n)-[r*1..4]-(m)
    return p 
    limit 1000
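
The query above only works once every node carries the :Entity label and an id property. One way to backfill them is sketched below. It assumes the existing nodes already have :Phone, :Person and :Case labels (the question does not show whether they do) and that the key properties are named as in the question's query; both are assumptions to adjust for the real data.

```cypher
// Assumption: nodes are already labelled :Phone, :Person and :Case,
// and the key properties match the names used in the question's query.
// Each statement adds the shared :Entity label and copies the key into id.
MATCH (n:Phone)  SET n:Entity, n.id = n.phoneNumber;
MATCH (n:Person) SET n:Entity, n.id = n.identificationNumber;
MATCH (n:Case)   SET n:Entity, n.id = n.caseSjNo;
```

On a large store each statement may need to be run in batches to keep the transaction size manageable. If a phone number could ever equal a case number or an identification number, the shared id would be ambiguous, so a type prefix on the id might be needed.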
    

Otherwise, split it into 6 separate statements and combine them with UNION.

       match p = (n:Person)-[r*1..4]-(m:Case) 
       where n.identificationNumber in ["xxx","yyy"] and m.caseSjNo in ["xxx","yyy"]
       return p limit 500
       UNION
       match p = (n:Person)-[r*1..4]-(m:Phone) 
       where n.identificationNumber in ["xxx","yyy"] and m.phoneNumber in ["xxx","yyy"]
       return p limit 500
       UNION
       ...
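
If only the shortest connections between the entered values matter, rather than every path up to depth 4, the lookup can be combined with allShortestPaths. This is a minimal sketch, assuming the :Entity label and indexed id property suggested in the answer; it is not a tested replacement for the original query.

```cypher
// Assumption: all nodes carry :Entity with an indexed id property.
// Find the start and end nodes first, then expand only the shortest
// paths (up to 4 hops, any direction) between each pair.
MATCH (n:Entity), (m:Entity)
WHERE n.id IN ["xxx","yyy"] AND m.id IN ["xxx","yyy"] AND n <> m
MATCH p = allShortestPaths((n)-[*..4]-(m))
RETURN p
LIMIT 1000
```

Running the statement with PROFILE, as in the question, shows whether the DbHits drop compared with the unrestricted variable-length match.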