I have the following Cypher query:
CALL apoc.index.nodes('node_auto_index','pref_label:(Foo)')
YIELD node, weight
WHERE node.corpus = 'my_corpus'
WITH node, weight
MATCH (selected:ontoterm{corpus:'my_corpus'})-[:spotted_in]->(:WEBSITE)<-[:spotted_in]-(node:ontoterm{corpus:'my_corpus'})
WHERE selected.uri = 'http://uri1'
OR selected.uri = 'http://uri2'
OR selected.uri = 'http://uri3'
RETURN DISTINCT node, weight
ORDER BY weight DESC LIMIT 10
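The first part (up to the WITH) runs very fast (Lucene legacy index) and returns ~100 nodes. The uri property is also unique (selected = 3 nodes). I have ~300 WEBSITE nodes. The execution time is 48749 ms.
How can I restructure the query to improve performance? And why are there ~13.8 million rows in the profile?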
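Answer 0 (score: 1)
I think the problem lies in the WITH clause, which expands the result set. InverseFalcon's answer made the query faster: 49 -> 18 seconds (but that is still not fast enough). To avoid the huge expansion, I collect the websites first. The following query takes 60 ms: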
MATCH (selected:ontoterm)-[:spotted_in]->(w:WEBSITE)
WHERE selected.uri IN ['http://avgl.net/carbon_terms/Faser', 'http://avgl.net/carbon_terms/Carbon', 'http://avgl.net/carbon_terms/Leichtbau']
AND selected.corpus = 'carbon_terms'
WITH collect(DISTINCT w) AS websites
CALL apoc.index.nodes('node_auto_index','pref_label:(Fas OR Fas*)^10 OR pref_label_deco:(Fas OR Fas*)^3 OR alt_label:(Fa)^5') YIELD node, weight
WHERE node.corpus = 'carbon_terms' AND node:ontoterm
WITH websites, node, weight
MATCH (node)-[:spotted_in]->(w:WEBSITE)
WHERE w IN websites
RETURN node, weight
ORDER BY weight DESC
LIMIT 10
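To see where the expansion comes from, it can help to count how many rows the two-hop pattern produces for each selected term before any filtering on the Lucene results. The following is only a diagnostic sketch: it reuses the labels and placeholder URIs from the question, and the aliases `other`, `websites` and `paths` are names chosen here for illustration.

```cypher
// Diagnostic sketch: count the paths produced by the two-hop pattern
// for each selected term (labels and URIs taken from the question).
MATCH (selected:ontoterm)-[:spotted_in]->(w:WEBSITE)<-[:spotted_in]-(other:ontoterm)
WHERE selected.uri IN ['http://uri1', 'http://uri2', 'http://uri3']
  AND selected.corpus = 'my_corpus'
RETURN selected.uri AS uri,
       count(DISTINCT w) AS websites,  // distinct websites reached
       count(*) AS paths               // total rows the pattern expands to
ORDER BY paths DESC
```

If `paths` is far larger than `websites`, most of the work is going into enumerating paths rather than finding distinct nodes, which is what collecting the websites up front avoids.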
Answer 1 (score: 0)
I don't see any NodeUniqueIndexSeek in your plan, so the selected nodes aren't being looked up efficiently.
Make sure you have a unique constraint on ontoterm(uri).
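As a minimal sketch, assuming a Neo4j 3.x server (which the legacy apoc.index procedures suggest) and that uri really is unique across all ontoterm nodes, the constraint could be created like this. The variable name `t` is arbitrary, and newer Neo4j versions use the `CREATE CONSTRAINT ... FOR ... REQUIRE ... IS UNIQUE` form instead.

```cypher
// Neo4j 3.x syntax (assumed version); fails if duplicate uri values already exist.
CREATE CONSTRAINT ON (t:ontoterm) ASSERT t.uri IS UNIQUE;

// List the existing constraints to confirm it was created.
CALL db.constraints();
```

Creating the constraint also creates the backing index that the planner needs for a NodeUniqueIndexSeek.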
Once the unique constraint is in place, give this a try:
PROFILE CALL apoc.index.nodes('node_auto_index','pref_label:(Foo)')
YIELD node, weight
WHERE node.corpus = 'my_corpus' AND node:ontoterm
WITH node, weight
MATCH (selected:ontoterm)
WHERE selected.uri IN ['http://uri1', 'http://uri2', 'http://uri3']
AND selected.corpus = 'my_corpus'
WITH node, weight, selected
MATCH (selected)-[:spotted_in]->(:WEBSITE)<-[:spotted_in]-(node)
RETURN DISTINCT node, weight
ORDER BY weight DESC LIMIT 10
Take a look at the query plan. You should see a NodeUniqueIndexSeek somewhere in there, and hopefully the db hits will have dropped.