Question

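We use Titan with Persistit as the backend for a graph with roughly 100,000 vertices. Our use case is fairly complex, but the current problem can be illustrated with a simple example. Assume we store books and authors in the graph. Every Book vertex has an ISBN number that is unique across the whole graph.

I need to answer the following query: give me the ISBN numbers of all books in the graph.

Currently, we do it like this: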

// retrieve graph instance
TitanGraph graph = getGraph();
// Start a Gremlin query (I omit the generics for brevity here)
GremlinPipeline gremlin = new GremlinPipeline().start(graph);
// get all vertices in the graph which represent books (we have author vertices, too!)
gremlin.V("type", "BOOK");
// the ISBN numbers are unique, so we use a Set here
Set<String> isbnNumbers = new HashSet<String>();
// iterate over the gremlin result and retrieve the vertex property
while (gremlin.hasNext()) {
    // the raw (non-generic) pipeline yields Object, so a cast is needed
    Vertex v = (Vertex) gremlin.next();
    // cast explicitly so the value can be added to a Set<String>
    isbnNumbers.add((String) v.getProperty("ISBN"));
}
return isbnNumbers;

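My question is: is there a smarter way to do this faster? I am new to Gremlin, so it is quite possible that I am doing something really stupid here. The query currently takes 2.5 seconds, which is not too bad, but I would like to speed it up if possible. Please treat the backend as fixed (it cannot be changed).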

Answer 1

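I doubt there is a faster way (you will always need to iterate over all book vertices), but Groovy/Gremlin allows a less verbose solution for your task. On the sample graph you can run, for example, the following query: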

gremlin> namesOfJavaProjs = []; g.V('lang','java').name.store(namesOfJavaProjs)
gremlin> namesOfJavaProjs
==>lop
==>ripple

Or, for your book graph:

isbnNumbers = []; g.V('type','BOOK').ISBN.store(isbnNumbers)
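For readers who think in Java rather than Groovy, the shape of that one-liner (filter by type, project one property, collect the results) can be sketched with plain collections. This is only an illustration under stated assumptions: the class `IsbnScan` and its helper `vertex` are hypothetical, plain maps stand in for vertices, and no Titan or Gremlin API is involved. It shows why the work stays linear in the number of vertices scanned, not how Titan executes the query.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in: plain maps play the role of vertices; no Titan or Gremlin API is used.
final class IsbnScan {

    // Builds a "vertex" as a property map (illustrative helper, not part of any library).
    static Map<String, Object> vertex(String type, String key, String value) {
        Map<String, Object> v = new HashMap<>();
        v.put("type", type);
        v.put(key, value);
        return v;
    }

    // Same shape as the Groovy pipeline: filter on type, project the ISBN, store it.
    static Set<String> collectIsbns(List<Map<String, Object>> vertices) {
        Set<String> isbns = new TreeSet<>();           // sorted set, duplicates collapse
        for (Map<String, Object> v : vertices) {       // every vertex is visited exactly once
            if ("BOOK".equals(v.get("type"))) {        // skip authors and anything else
                Object isbn = v.get("ISBN");
                if (isbn != null) {
                    isbns.add(isbn.toString());
                }
            }
        }
        return isbns;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> graph = Arrays.asList(
            vertex("BOOK", "ISBN", "isbn-2"),
            vertex("AUTHOR", "name", "some author"),
            vertex("BOOK", "ISBN", "isbn-1"));
        System.out.println(collectIsbns(graph));
    }
}
```

Whatever syntax is used, each book vertex still has to be touched once, so the Groovy version mainly buys brevity rather than speed.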

How to speed up "global" queries in Titan DB?
