Neo4j legacy relationship auto-index is slow in Cypher queries

Time: 2015-04-04 21:29:43

Tags: indexing neo4j cypher relationship timing

Nodes

1000000 x ({prop:'a'})
1000000 x ({prop:'b'})
1000000 x ({prop:'c'})

Node set = ~3 MegaNodes

Note: prop is not a unique property.


Relationships

1000 x [:TYPEA {date:20150301} ]
1000 x [:TYPEA {date:20150228} ]
1000 x [:TYPEA {date:20150227} ]
1000 x [:TYPEA {date:........} ]
1000 x [:TYPEA {date:19000101} ]

1000 x [:TYPEB {date:20150301} ]
1000 x [:TYPEB {date:20150228} ]
1000 x [:TYPEB {date:20150227} ]
1000 x [:TYPEB {date:........} ]
1000 x [:TYPEB {date:19000101} ]

TYPEA = 42062 days x 1 000 rels

TYPEA = ~42 000 000

TYPEB = ~42 000 000

Relationship set = ~84 MegaRels
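
The totals above can be re-derived with a few lines of Python. This is only a sanity check of the arithmetic. It assumes that 42062 is the day difference between 1900-01-01 and 2015-03-01 and that each date carries 1000 relationships per type; the variable names are illustrative and not part of the question.

```python
from datetime import date

# Assumption: the date range runs from 19000101 to 20150301, as listed above.
first_day = date(1900, 1, 1)
last_day = date(2015, 3, 1)

days = (last_day - first_day).days   # day difference between the two endpoints
rels_per_day = 1_000                 # 1000 relationships per date and type

rels_per_type = days * rels_per_day  # TYPEA (and likewise TYPEB)
total_rels = 2 * rels_per_type       # TYPEA + TYPEB

print(days, rels_per_type, total_rels)
```

The result is roughly 42 million relationships per type and 84 million overall, which matches the rounded figures given above.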


I want to match the pattern:

MATCH (n1 {prop:'a'}) -[ r1:TYPEA {date:20001231} ]-> (n2 {prop:'b'})
RETURN n2;

Improving it with an index

My neo4j.properties:

relationship_auto_indexing=true
relationship_keys_indexable=date

Cypher query:

START 
  r1 = relationship:relationship_auto_index('date:20001231')
MATCH (n1 {prop:'a'}) -[r1:TYPEA]-> (n2 {prop:'b'})
RETURN n2;

:) Works fine!


Now I want to match the pattern:

MATCH
  (n1 {prop:'a'})
  -[ r1:TYPEA {date:20001231} ]->
  (n2 {prop:'b'})
  -[ r2:TYPEA {date:20001231} ]->
  (n3  {prop:'c'})
RETURN n2, n3;

So I tried:

START 
  r1 = relationship:relationship_auto_index('date:20001231'),
  r2 = relationship:relationship_auto_index('date:20001231')
MATCH (n1 {prop:'a'}) -[r1:TYPEA]-> (n2 {prop:'b'}) -[r2:TYPEA]-> (n3 {prop:'c'})
RETURN DISTINCT n2,  n3;

:( It runs slowly.


That is because the Cartesian product of the two index lookups produces many intermediate results: 1000^2 rows.
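
To put a number on this: with two independent lookups in START, every hit for r1 is paired with every hit for r2 before MATCH can discard the pairs that do not form a path. A rough sketch of the row count, assuming 1000 index hits per lookup as stated above (if the auto-index also returns the TYPEB relationships for that date, the count per lookup would be higher). The function name is hypothetical.

```python
def cartesian_rows(hits_r1: int, hits_r2: int) -> int:
    """Rows produced when two independent index lookups are combined."""
    return hits_r1 * hits_r2


hits = 1_000  # assumed number of index hits for date:20001231
rows = cartesian_rows(hits, hits)
print(rows)
```

Each of those rows still has to be checked against the pattern, so the cost grows with the square of the number of hits per date.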

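On the one hand, it is not possible to use the same identifier more than once in a query.

On the other hand, label (schema) indexes do not apply to relationships.

Is there any hope? (Release: Neo4j-community-2.2.0)

What is the benefit of a legacy relationship index when the START clause is not used in a Cypher query?

Thanks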

1 answer:

Answer 0: (score: 1)

This changes the query conceptually, but it works fine:

START 
  r = relationship:relationship_auto_index('date:20001231')
WITH [x IN COLLECT(r) WHERE TYPE(x)='TYPEA'] AS cr
UNWIND cr AS r1
  MATCH (n1 {prop:'a'}) -[r1]-> (n2 {prop:'b'})
WITH DISTINCT n2, cr
UNWIND cr AS r2
  MATCH (n2) -[r2]-> (n3 {prop:'c'})  
RETURN DISTINCT n2,  n3;

THX
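
The idea behind the query above can be illustrated with a toy Python model. This is not how Neo4j executes the query; it only counts candidate rows for two strategies on invented data. The tuples, node ids and function names are assumptions made for the example.

```python
from itertools import product

# Toy data, invented for illustration: (rel_id, type, start_node, end_node).
# node_prop maps a node id to its 'prop' value.
node_prop = {1: "a", 2: "b", 3: "c", 4: "b", 5: "c", 6: "a"}
index_hits = [  # what one index lookup for a single date might return
    (10, "TYPEA", 1, 2),
    (11, "TYPEA", 2, 3),
    (12, "TYPEB", 1, 4),
    (13, "TYPEA", 4, 5),
    (14, "TYPEA", 6, 2),
]


def naive(hits):
    """Pair every hit with every hit, then keep pairs forming a -> b -> c."""
    rows, result = 0, set()
    for r1, r2 in product(hits, hits):
        rows += 1
        if (r1[1] == r2[1] == "TYPEA" and r1[3] == r2[2]
                and node_prop[r1[2]] == "a" and node_prop[r1[3]] == "b"
                and node_prop[r2[3]] == "c"):
            result.add((r1[3], r2[3]))
    return result, rows


def staged(hits):
    """Filter by type once, find the distinct n2 nodes, then expand each n2."""
    cr = [r for r in hits if r[1] == "TYPEA"]
    rows = len(cr)  # first pass: one row per TYPEA hit
    n2s = {r[3] for r in cr
           if node_prop[r[2]] == "a" and node_prop[r[3]] == "b"}
    result = set()
    for n2 in n2s:  # second pass: hits are re-expanded only per distinct n2
        for r2 in cr:
            rows += 1
            if r2[2] == n2 and node_prop[r2[3]] == "c":
                result.add((n2, r2[3]))
    return result, rows


naive_result, naive_rows = naive(index_hits)
staged_result, staged_rows = staged(index_hits)
print(naive_result == staged_result, naive_rows, staged_rows)
```

Both strategies return the same (n2, n3) pairs, but the staged one examines far fewer candidate rows whenever the number of distinct n2 nodes is small compared with the number of index hits.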