Forcing the cost planner to start from a specific index seek

Time: 2018-12-01 18:22:33

Tags: neo4j cypher

My Cypher query

EXPLAIN MATCH (b:Block)<-[:INCLUDED_IN]-(tx:Transaction {pstype: 0})
WHERE 1540512000 <= b.time < 1540598400
RETURN count(tx);

produces the following execution plan:

+-------------------+----------------+-----------------+----------------------------------+
| Operator          | Estimated Rows | Identifiers     | Other                            |
+-------------------+----------------+-----------------+----------------------------------+
| +ProduceResults   |             12 | count(tx)       |                                  |
| |                 +----------------+-----------------+----------------------------------+
| +EagerAggregation |             12 | count(tx)       |                                  |
| |                 +----------------+-----------------+----------------------------------+
| +Filter           |            136 | anon[16], b, tx | see predicate below              |
| |                 +----------------+-----------------+----------------------------------+
| +Expand(All)      |           9052 | anon[16], b, tx | (tx)-[anon[16]:INCLUDED_IN]->(b) |
| |                 +----------------+-----------------+----------------------------------+
| +NodeIndexSeek    |           9052 | tx              | :Transaction(pstype)             |
+-------------------+----------------+-----------------+----------------------------------+

Filter predicate (Other column): AndedPropertyInequalities(Variable(b),Property(Variable(b),PropertyKeyName(time)),GreaterThanOrEqual(Property(Variable(b),PropertyKeyName(time)),Parameter(  AUTOINT2,Integer)), LessThan(Property(Variable(b),PropertyKeyName(time)),Parameter(  AUTOINT1,Integer)))

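Execution is far too slow, because the first NodeIndexSeek, on :Transaction(pstype), actually returns tens of millions of nodes rather than the estimated 9052. A NodeIndexSeekByRange on b:Block(time) would yield only about 600 nodes.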
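A quick way to check these two figures is to count each side on its own. This is only a sketch, assuming the labels, properties, and time window are exactly those of the query above; the two statements are meant to be run separately.

```cypher
// Actual number of Transaction nodes matched by the pstype index seek
// (the plan above estimates 9052).
MATCH (tx:Transaction {pstype: 0})
RETURN count(tx);

// Actual number of Block nodes inside the time window.
MATCH (b:Block)
WHERE 1540512000 <= b.time < 1540598400
RETURN count(b);
```

If the first count is orders of magnitude larger than the second, starting from the :Block(time) range seek is the cheaper direction.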

I tried to force the plan to start from b:Block(time), but it still uses the NodeIndexSeek on tx:Transaction(pstype):

EXPLAIN MATCH (b:Block)<-[:INCLUDED_IN]-(tx:Transaction {pstype: 0})
USING INDEX b:Block(time)
WHERE 1540512000 <= b.time < 1540598400
RETURN count(tx);

which produces

+-------------------------+----------------+-----------------+--------------------------------------------------------------+
| Operator                | Estimated Rows | Identifiers     | Other                                                        |
+-------------------------+----------------+-----------------+--------------------------------------------------------------+
| +ProduceResults         |             12 | count(tx)       |                                                              |
| |                       +----------------+-----------------+--------------------------------------------------------------+
| +EagerAggregation       |             12 | count(tx)       |                                                              |
| |                       +----------------+-----------------+--------------------------------------------------------------+
| +NodeHashJoin           |            136 | anon[16], b, tx | b                                                            |
| |\                      +----------------+-----------------+--------------------------------------------------------------+
| | +NodeIndexSeekByRange |          14703 | b               | :Block(time) >= {  AUTOINT2} AND :Block(time) < {  AUTOINT1} |
| |                       +----------------+-----------------+--------------------------------------------------------------+
| +Expand(All)            |           9052 | anon[16], b, tx | (tx)-[anon[16]:INCLUDED_IN]->(b)                             |
| |                       +----------------+-----------------+--------------------------------------------------------------+
| +NodeIndexSeek          |           9052 | tx              | :Transaction(pstype)                                         |
+-------------------------+----------------+-----------------+--------------------------------------------------------------+

The only way I have found to make it run fast is to use the rule planner (it is several orders of magnitude faster):

CYPHER planner=rule MATCH (b:Block)
WHERE 1540512000 <= b.time < 1540598400
WITH b
MATCH (b)<-[:INCLUDED_IN]-(tx:Transaction {pstype: 0})
RETURN count(tx);

Is there any way to make this work with the cost planner?

Both :Block(time) and :Transaction(pstype) are indexed.

1 answer:

Answer 0 (score: 0)

You could try a join hint on tx in addition to the index hint; this should ensure that expansion happens from only one direction:

EXPLAIN 
MATCH (b:Block)<-[:INCLUDED_IN]-(tx:Transaction {pstype: 0})
USING INDEX b:Block(time)
USING JOIN ON tx
WHERE 1540512000 <= b.time < 1540598400
RETURN count(tx);
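EXPLAIN only shows the plan that would be chosen, with estimated row counts. To see whether the hints change what actually happens at runtime, the same query can be run with PROFILE, which executes it and reports actual rows and database hits per operator. The sketch below reuses the query above unchanged apart from the keyword:

```cypher
// PROFILE runs the query and shows actual rows and DB hits for each operator,
// whereas EXPLAIN shows estimates only and does not execute anything.
PROFILE
MATCH (b:Block)<-[:INCLUDED_IN]-(tx:Transaction {pstype: 0})
USING INDEX b:Block(time)
USING JOIN ON tx
WHERE 1540512000 <= b.time < 1540598400
RETURN count(tx);
```

Because PROFILE executes the query in full, it will be just as slow as the original if the plan still starts from :Transaction(pstype).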

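Alternatively, you can restructure the query slightly so that the tx node is not part of the pattern initially but is enforced in the WHERE clause instead. You need to split the MATCH in two, but I don't think you will need any planner hints: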

EXPLAIN 
MATCH (tx:Transaction {pstype: 0})
MATCH (b:Block)<-[:INCLUDED_IN]-(x)
WHERE 1540512000 <= b.time < 1540598400
AND x = tx
RETURN count(tx);

EDIT

OK, then let's try another approach:

EXPLAIN 
MATCH (b:Block)<-[:INCLUDED_IN]-(x)
WHERE 1540512000 <= b.time < 1540598400
AND x.pstype = 0 // AND 'Transaction' in labels(x)
RETURN count(x);

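Without a label on x, no index lookup can be used for it. If nodes other than :Transaction nodes also carry a pstype property, you can try uncommenting the end of that line, which checks for the label in a different way (I don't think that form will trigger an index lookup, but I'm not entirely sure).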
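For reference, this is what the query looks like with the commented-out label check enabled. It returns count(x), since x is the only variable bound to the transaction node in this form. Whether the planner still avoids the :Transaction(pstype) index here is not guaranteed, so the resulting plan is worth checking.

```cypher
EXPLAIN
MATCH (b:Block)<-[:INCLUDED_IN]-(x)
WHERE 1540512000 <= b.time < 1540598400
  AND x.pstype = 0
  // Label check expressed as a list membership test rather than in the pattern,
  // so that x carries no label the planner could use for an index seek.
  AND 'Transaction' IN labels(x)
RETURN count(x);
```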

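Another approach (I'm not sure it will work) is to use a pattern comprehension, after the initial match on b, to collect the matching nodes for each block into a list and then sum the list sizes: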

EXPLAIN 
MATCH (b:Block)
WHERE 1540512000 <= b.time < 1540598400
RETURN sum(size([(b)<-[:INCLUDED_IN]-(x:Transaction) WHERE x.pstype = 0 | x])) as count