Question

I have a Neo4j graph made up of Serie nodes and EDGE relationships. I have a query that computes allShortestPaths between two nodes.

MATCH (serie1:Serie {serie_id: 'id1'}),
      (serie2:Serie {serie_id: 'id2'}),
      p = allShortestPaths((serie1)-[:EDGE*..6]-(serie2))
RETURN p AS shortestPath

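My application runs several iterations. In each iteration it executes the query many times, each time with a different pair of nodes (serie1, serie2), and then writes some new EDGE relationships to the graph.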

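The first iteration (20 queries) runs very fast, but from the second iteration on the response time starts to grow, and each query execution takes more than 3 minutes.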

I have already created an index on the serie_id property, increased the heap space to 8 GB, and sized the page cache so that it has enough room.

I have also looked into whether the query could be rewritten some other way, but this seems to be the best form.

My guess is that the problem is related to the number of executions, but I am not sure how to optimize it.

Answer 1

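"My application runs several iterations. In each iteration it executes the query many times, each time with a different pair of nodes (serie1, serie2), and then writes some new EDGE relationships to the graph."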

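If the new writes affect the paths or nodes matched during the iterations, you may be running into statistics sampling: once the statistics have changed enough, the cached query is marked as stale and has to be planned again.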

I suggest you check the logs/debug.log file to see whether it contains lines like the following:

2016-09-22 09:32:39.178+0000 INFO  [o.n.k.i.a.i.s.OnlineIndexSamplingJob] Sampled index :WithIndex(id) with 1001 unique values in sample of avg size 1001 taken from index containing 1001 entries
2016-09-22 09:32:49.179+0000 INFO  [o.n.k.i.a.i.s.OnlineIndexSamplingJob] Sampled index :WithIndex(id) with 1001 unique values in sample of avg size 1001 taken from index containing 1001 entries

and

2016-09-22 07:26:01.359+0000 INFO  [o.n.c.i.ExecutionEngine] Discarded stale query from the query cache: CREATE (n:Node {id: {i} })
2016-09-22 07:26:01.361+0000 INFO  [o.n.c.i.CypherCompiler] Discarded stale query from the query cache: CREATE (n:Node {id: {i} })
2016-09-22 07:26:10.403+0000 INFO  [o.n.c.i.ExecutionEngine] Discarded stale query from the query cache: CREATE (n:Node {id: {i} })
2016-09-22 07:26:10.404+0000 INFO  [o.n.c.i.CypherCompiler] Discarded stale query from the query cache: CREATE (n:Node {id: {i} })

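Of course, these lines reflect the queries I ran in my own application; your queries should appear there instead. You can also see GC pauses in these logs.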
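As a rough way to check this on a large log, the lines above can be counted with a small script. This is a minimal Python sketch, not part of Neo4j: the function names are hypothetical, and the two patterns are assumptions taken only from the sample log lines shown above, so they may need adjusting for other Neo4j versions.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Patterns assumed from the sample debug.log lines in this answer.
STALE_RE = re.compile(r"Discarded stale query from the query cache: (?P<query>.*)$")
SAMPLING_RE = re.compile(r"OnlineIndexSamplingJob\] Sampled index (?P<index>\S+)")


def summarize_debug_log(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count stale-query discards per query text and sampling runs per index."""
    stale: Counter = Counter()
    sampled: Counter = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        match = STALE_RE.search(line)
        if match:
            # In the sample log each discard is reported twice (ExecutionEngine
            # and CypherCompiler), so the counts here include both lines.
            stale[match.group("query").strip()] += 1
            continue
        match = SAMPLING_RE.search(line)
        if match:
            sampled[match.group("index")] += 1
    return stale, sampled


def summarize_debug_log_file(path: str) -> Tuple[Counter, Counter]:
    """Convenience wrapper that reads the log from a file path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_debug_log(handle)
```

Calling summarize_debug_log_file("logs/debug.log") returns two counters. If the allShortestPaths query shows up among the discarded queries with a count that grows at each iteration, that would support the stale-plan explanation.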

You can tune the threshold at which a query is considered stale. The settings are described here:

https://neo4j.com/docs/operations-manual/current/reference/#config_cypher.statistics_divergence_threshold

and here:

https://neo4j.com/docs/operations-manual/current/reference/#config_cypher.min_replan_interval

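There is no value that works out of the box. I would raise the first one to 0.8, but this is not easy to tune, because it affects other queries as well.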
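For illustration, the change suggested above might look like this in the configuration file. This is a sketch that assumes the settings live in conf/neo4j.conf, as in Neo4j 3.x. The 0.8 comes from the suggestion above; the second setting is left commented out with a placeholder because no value is given for it here.

```
# conf/neo4j.conf (illustrative sketch, not a recommendation)

# Raise the divergence at which a cached query plan is considered stale.
cypher.statistics_divergence_threshold=0.8

# Minimum time before a cached plan may be considered for replanning.
# Placeholder only: pick a value after checking the documentation linked above.
#cypher.min_replan_interval=<duration>
```

Because both settings apply to the whole database, it is worth comparing timings for other queries before and after the change.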
