The sparql-gremlin tool does not use indexes

Time: 2019-12-10 13:28:26

Tags: sparql gremlin tinkerpop tinkerpop3 janusgraph

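I am using sparql-gremlin 3.4.0 with JanusGraph 0.3.1. After creating an index on the vertex property "iri", a Gremlin query returns its result immediately. If I run the same query in SPARQL, however, it does not use any index. In the example below I use the force-index option to avoid scan queries.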

[Image: gremlin_sparql_query]

Any suggestions?

1 answer:

Answer 0 (score: 1)

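There are perhaps two issues to consider: (1) TinkerPop is not optimizing the query into a form for which JanusGraph can easily use an index, or (2) JanusGraph is not optimizing some aspect of the query to use an index. For the latter, JanusGraph would have to optimize the match() step well enough to use indexes, because match() is the core step that sparql-gremlin uses in its translation. I am fairly sure it does not do that. As for the former, JanusGraph probably relies on TinkerPop to transform match() into something more readily consumable. In your example, the hope would be that it ends up as the query you originally tested and that JanusGraph can handle: g.V().has('iri', ...). I think explain() will show you what is happening, as it did for me when I tested a variation of your example with TinkerGraph: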

gremlin> s.sparql("SELECT ?x WHERE { ?x v:name 'marko' }").explain()
==>Traversal Explanation
===========================================================================================================================================================================================
Original Traversal                 [InjectStep([SELECT ?x WHERE { ?x v:name 'marko' }])]

ConnectiveStrategy           [D]   [InjectStep([SELECT ?x WHERE { ?x v:name 'marko' }])]
SparqlStrategy               [D]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
MatchPredicateStrategy       [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
FilterRankingStrategy        [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
EarlyLimitStrategy           [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
InlineFilterStrategy         [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
IncidentToAdjacentStrategy   [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
AdjacentToIncidentStrategy   [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
RepeatUnrollStrategy         [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
CountStrategy                [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
PathRetractionStrategy       [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
LazyBarrierStrategy          [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
TinkerGraphCountStrategy     [P]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
TinkerGraphStepStrategy      [P]   [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
ProfileStrategy              [F]   [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]
StandardVerificationStrategy [V]   [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]

Final Traversal                    [TinkerGraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(x), PropertiesStep([name],value), IsStep(eq(marko)), MatchEndStep]]), SelectOneStep(last,x)]

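Not so good. The filter stays inside MatchStep as PropertiesStep followed by IsStep, and the final TinkerGraphStep(vertex,[]) carries no conditions, so nothing is available for an index lookup.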
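To reproduce an explanation like the one above, a Gremlin Console session along the following lines should work. This is a minimal sketch, assuming the sparql-gremlin plugin for 3.4.0 can be installed in your console and using TinkerGraph's "modern" toy graph; the variable name s simply mirrors the one used above.

```groovy
// One-time plugin setup in the Gremlin Console (version assumed to match the question)
:install org.apache.tinkerpop sparql-gremlin 3.4.0
:plugin use tinkerpop.sparql

// TinkerGraph toy graph that ships with TinkerPop
graph = TinkerFactory.createModern()

// Traversal source that accepts SPARQL and translates it to Gremlin
s = graph.traversal(SparqlTraversalSource)

// Show how each strategy rewrites the translated traversal
s.sparql("SELECT ?x WHERE { ?x v:name 'marko' }").explain()
```

Against JanusGraph the same explain() call can be run on a SparqlTraversalSource created from the JanusGraph instance; the strategy list will then include JanusGraph's own provider strategies instead of the TinkerGraph ones.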

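So the options for a solution are:

  1. JanusGraph needs to optimize match() better to handle this kind of query, or
  2. The standard TinkerPop traversal strategies should do a better job of transforming this kind of query into a more generally usable pattern, or
  3. sparql-gremlin should compile to Gremlin that matches the existing traversal strategies more cleanly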

On that last point, note what happens if sparql-gremlin had generated this match() query instead:

gremlin> g.V().match(__.as('a').has('person','name','marko')).select('a').values('name').explain()
==>Traversal Explanation
==========================================================================================================================================================================================================
Original Traversal                 [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],v
                                      alue)]

ConnectiveStrategy           [D]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],v
                                      alue)]
MatchPredicateStrategy       [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],v
                                      alue)]
FilterRankingStrategy        [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],v
                                      alue)]
EarlyLimitStrategy           [O]   [GraphStep(vertex,[]), MatchStep(AND,[[MatchStartStep(a), HasStep([~label.eq(person), name.eq(marko)]), MatchEndStep]]), SelectOneStep(last,a), PropertiesStep([name],v
                                      alue)]
InlineFilterStrategy         [O]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
IncidentToAdjacentStrategy   [O]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
AdjacentToIncidentStrategy   [O]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
RepeatUnrollStrategy         [O]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
CountStrategy                [O]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), PropertiesStep([name],value)]
PathRetractionStrategy       [O]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
LazyBarrierStrategy          [O]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
TinkerGraphCountStrategy     [P]   [GraphStep(vertex,[]), HasStep([~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
TinkerGraphStepStrategy      [P]   [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
ProfileStrategy              [F]   [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]
StandardVerificationStrategy [V]   [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]

Final Traversal                    [TinkerGraphStep(vertex,[~label.eq(person), name.eq(marko)])@[a], SelectOneStep(last,a), NoOpBarrierStep(2500), PropertiesStep([name],value)]

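Much better. InlineFilterStrategy pulls the has() out of match(), and TinkerGraphStepStrategy then folds its conditions into TinkerGraphStep. So I tend to think this is a general problem for TinkerPop to solve, involving some combination of the last two points. If JanusGraph could optimize match() further, all the better. None of this is a solution to your problem, but it should at least explain what is happening and where the problem lies. I created TINKERPOP-2325 for further discussion and tracking.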
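To compare the two query shapes side by side without going through SPARQL, something like the following can be pasted into the Gremlin Console. It is only a sketch on TinkerGraph's "modern" toy graph: the first traversal is a hand-written approximation of what sparql-gremlin emits (taken from the step names in the first explanation), not the translator's literal output.

```groovy
// TinkerGraph toy graph and a plain Gremlin traversal source
graph = TinkerFactory.createModern()
g = graph.traversal()

// Approximation of the sparql-gremlin translation: values().is() inside match()
g.V().match(__.as('x').values('name').is('marko')).select('x').explain()

// Same filter written with has() inside match()
g.V().match(__.as('x').has('name', 'marko')).select('x').explain()

// Plain has(), the form the question reports as fast on JanusGraph
g.V().has('name', 'marko').explain()
```

Going by the explanations above, I would expect the first traversal to keep its MatchStep through to the final traversal and the other two to end with the name condition folded into TinkerGraphStep. On JanusGraph the same comparison can be run against the "iri" property, where the graph step should appear as JanusGraphStep rather than TinkerGraphStep.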