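Neo4j is not using the index

Date: 2015-06-05 13:33:08

Tags: neo4j

I am currently trying this query on Neo4j 2.2.2.

At the time of this post we have not labeled any nodes, because we recently upgraded from Neo4j 1.x. So we do not have the option of using the USING clause.

I am trying to use the index, but the query ends up doing a full scan.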

START pfComp=node:Company('id:2403226'), ptComp=node:Company('id:1946633')
OPTIONAL MATCH
    (pfComp)<-[c:CHILD_OF*]-(cfComp)
WITH collect(id(cfComp)) AS cfCompIds, ptComp, pfComp
OPTIONAL MATCH
    (ptComp)<-[c2:CHILD_OF*]-(ctComp)
WITH cfCompIds, collect(id(ctComp)) AS ctCompIds
MATCH
    (fComp)-[fR:PARTICIPATES_IN]->(cdeals)<-[tR:PARTICIPATES_IN]-(tComp)
WHERE
    (fComp.id = 2403226 OR id(fComp) IN cfCompIds) AND
    (tComp.id = 1946633 OR id(tComp) IN ctCompIds)
RETURN fComp, tComp, cdeals

Cypher version: CYPHER 2.2, planner: COST. 1305292 total db hits in 79128 ms.

Any help with this would be greatly appreciated.

Below is the full output of the PROFILE command.

This is the explain plan for the query

The first part of the query executes quickly:

PROFILE START pfComp=node:Company('id:2403226'), ptComp=node:Company('id:1946633')
OPTIONAL MATCH
    (pfComp)<-[c:CHILD_OF*]-(cfComp)
WITH collect(id(cfComp)) AS cfCompIds, ptComp, pfComp
OPTIONAL MATCH
    (ptComp)<-[c2:CHILD_OF*]-(ctComp)
RETURN cfCompIds, collect(id(ctComp)) AS ctCompIds

Cypher version: CYPHER 2.2, planner: COST. 836 total db hits in 582 ms.


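3 answers:

Answer 0 (score: 2)

The second part of your query looks like a relational join, or rather a lookup (like n+1 selects). Perhaps use the graph model instead? The query also becomes simpler that way.

You can compute fComp and tComp directly in the initial matches: because of the *0.. bound, they include pfComp and ptComp themselves.

Then you build the cross product between fComp and tComp for the final match.

Please try it and see how it fares: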

MATCH (pfComp:lCompany)<-[c:CHILD_OF*0..]-(fComp:lCompany)
WHERE pfComp.id = 2403226
// reduce cardinality for following match
WITH collect(distinct fComp) as companies1
MATCH (ptComp:lCompany)<-[c2:CHILD_OF*0..]-(tComp:lCompany)
WHERE ptComp.id = 1946633
// create cross product between fComp and tComp
UNWIND companies1 as fComp
MATCH (fComp) -[fR:PARTICIPATES_IN]->(cdeals)<-[tR:PARTICIPATES_IN]-(tComp)
RETURN  fComp, tComp, cdeals;
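The effect of the `*0..` bound can be checked in isolation. The following is a minimal sketch, assuming the nodes already carry the `lCompany` label and an `id` property as in the query above. With a lower bound of zero the pattern also matches the zero-length path, so the starting company itself is returned alongside its descendants; with a plain `*` (that is, `*1..`) only the descendants are returned.

```cypher
// Sketch only: assumes the lCompany label and the id property exist.
// *0.. includes the zero-length path, so company 2403226 itself
// appears in the result together with everything below it.
MATCH (p:lCompany)<-[:CHILD_OF*0..]-(c:lCompany)
WHERE p.id = 2403226
RETURN DISTINCT c.id AS companyId
```

This is why the explicit `fComp.id = 2403226 OR ...` test from the question is no longer needed.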

Answer 1 (score: 1)

Your profile does contain an index lookup (NodeByQueryIndex).

You can specify which index the query should use:

MATCH (n:Swedish)
USING INDEX n:Swedish(surname)
WHERE n.surname = 'Taylor'
RETURN n

See http://neo4j.com/docs/stable/query-using.html
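Applied to the data in the question, the hint might look like the sketch below. It assumes a label `lCompany` and a schema index on `:lCompany(id)`, neither of which existed when the question was asked. `USING INDEX` refers to schema indexes only, so it cannot point at the legacy `Company` index used in the `START` clause.

```cypher
// Sketch only: lCompany and its schema index on id are assumptions.
// Prefixing with PROFILE shows which operator the planner chose;
// with the hint honored, a NodeIndexSeek is expected instead of a scan.
PROFILE
MATCH (c:lCompany)
USING INDEX c:lCompany(id)
WHERE c.id = 2403226
RETURN c
```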

Answer 2 (score: 1)

Our solution was to create a label (lCompany) and add the newer (schema) index type on the Company.id property (CREATE INDEX ON :lCompany(id)).
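One possible way to carry out that migration is sketched below. It is not necessarily what was actually run: it assumes every company node is present in the legacy `Company` index and that the `id:*` Lucene wildcard query is accepted there. The two statements should be run separately, because schema changes and data updates cannot share a transaction.

```cypher
// Step 1 (sketch, assumption: all company nodes are in the legacy index):
// attach the new label to every node found through the legacy index.
START c=node:Company('id:*')
SET c:lCompany;

// Step 2: create the schema index on the labeled nodes.
CREATE INDEX ON :lCompany(id);
```

The index is populated in the background, so queries may only start using it once it is online.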

We then adjusted the query to use the new index:

OPTIONAL MATCH
    (pfComp:lCompany)<-[c:CHILD_OF*]-(cfComp:lCompany)
WHERE pfComp.id = 2403226
WITH
    collect(cfComp.id) AS cfCompIds,
    pfComp
OPTIONAL MATCH
    (ptComp:lCompany)<-[c2:CHILD_OF*]-(ctComp:lCompany)
WHERE ptComp.id = 1946633
WITH
    cfCompIds,
    collect(ctComp.id) AS ctCompIds,
    pfComp, ptComp
MATCH
    (fComp:lCompany)-[fR:PARTICIPATES_IN]->(cdeals)<-[tR:PARTICIPATES_IN]-(tComp:lCompany)
USING INDEX fComp:lCompany(id) //tComp:lCompany(id)
WHERE
    fComp.id IN (cfCompIds + [2403226])
    AND
    tComp.id IN (ctCompIds + [1946633])
RETURN fComp, tComp, cdeals

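Further optimization may be possible, but this is as far as we have gotten.

The profiler result is now:

Cypher version: CYPHER 2.2, planner: COST. 134151 total db hits in 1498 ms.

Here is the new profile after the adjustment:

[image: query profile]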