Question

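I have been playing around with a very small graph of 264,346 vertices and 733,846 edges. I bulk-imported it into Titan using BatchGraph and the recommended settings. That worked well; it was surprisingly efficient.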

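I am now experimenting with graph traversals and have implemented a quick Dijkstra's algorithm in Java using the Java library. I should mention that my Titan database runs on a local Cassandra node; it is not an embedded node.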

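An average Dijkstra run takes 1+ minutes for an average path in this graph. That is far too long. I would expect even a really slow Dijkstra on a graph like this to take under 10 seconds (on an in-memory graph of this size, an average query would take well under 1 second).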

What are the best practices for running such algorithms efficiently?

I will show parts of my simple Dijkstra implementation, in case the way I access vertices and edges is not the most efficient one.

Getting the graph instance:

TitanGraph graph = TitanFactory.open("cassandra:localhost");

The parts of the Dijkstra implementation that deal with graph access:

public int run(long src, long trg){
    this.pqueue.add(new PQNode(0, src));
    this.nodeMap.put(src, new NodeInfo(src, 0, -1));
    int dist = Integer.MAX_VALUE;
    while (!this.pqueue.isEmpty()){
        PQNode current = this.pqueue.poll();
        NodeInfo nodeInfo = this.nodeMap.get(current.getNodeId());
        long u = nodeInfo.getNodeId();

        if (u == trg) {
            dist = nodeInfo.getDistance();
            break;
        }

        if (nodeInfo.isSeen())
            continue;

        this.expansion++;

        TitanVertex vertex = graph.getVertex(u);
        for (TitanEdge out: vertex.getEdges()){
            Direction dir = out.getDirection(vertex);
            if (dir.compareTo(Direction.OUT) != 0 && dir.compareTo(Direction.BOTH) != 0){
                continue;
            }

            TitanVertex v = out.getOtherVertex(vertex);
            long vId = (long)v.getId();
            NodeInfo vInfo = this.nodeMap.get(vId);
            if (vInfo == null){
                vInfo = new NodeInfo(vId, Integer.MAX_VALUE, -1);
                this.nodeMap.put(vId, vInfo);
            }
            int weight = out.getProperty("weight");
            int currentDistance = nodeInfo.getDistance() + weight;
            if (currentDistance < vInfo.getDistance()){
                vInfo.setParent(u);
                vInfo.setDistance(currentDistance);
                this.pqueue.add(new PQNode(currentDistance, vId));
            }
        }
        nodeInfo.setSeen(true);
    }


    return dist;
}

How should I proceed in trying to run such algorithms efficiently on Titan?

Answer 1

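Setting your code/algorithm aside, I have to agree that your graph is, as you put it, "very small". That raises the question of why you chose Cassandra as your backend. For a graph of that size, I would try BerkeleyDB and see if that helps. Or, if you want to go really extreme, just use the fastest Blueprints implementation there is: TinkerGraph!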
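To illustrate the in-memory route, here is a minimal sketch of Dijkstra over adjacency lists held in plain Java collections. It is not TinkerGraph or any Blueprints API: the class name `InMemoryDijkstra` and its `addEdge` method are made up for illustration, and it assumes non-negative integer weights like the `weight` property in the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-in for an in-memory graph; not a Titan or TinkerGraph type.
class InMemoryDijkstra {
    // Outgoing edges per vertex id; each entry is {targetId, weight}.
    private final Map<Long, List<long[]>> outEdges = new HashMap<>();

    void addEdge(long from, long to, int weight) {
        outEdges.computeIfAbsent(from, k -> new ArrayList<>()).add(new long[] {to, weight});
    }

    // Returns the shortest distance from src to trg, or Integer.MAX_VALUE if trg is unreachable.
    int run(long src, long trg) {
        Map<Long, Integer> dist = new HashMap<>();
        // Queue entries are {distance, vertexId}, ordered by distance.
        PriorityQueue<long[]> queue = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        dist.put(src, 0);
        queue.add(new long[] {0, src});
        while (!queue.isEmpty()) {
            long[] current = queue.poll();
            long d = current[0];
            long u = current[1];
            if (u == trg) {
                return (int) d;
            }
            if (d > dist.get(u)) {
                continue; // stale entry: a shorter path to u was already settled
            }
            for (long[] edge : outEdges.getOrDefault(u, Collections.<long[]>emptyList())) {
                long v = edge[0];
                long candidate = d + edge[1];
                Integer known = dist.get(v);
                if (known == null || candidate < known) {
                    dist.put(v, (int) candidate);
                    queue.add(new long[] {candidate, v});
                }
            }
        }
        return Integer.MAX_VALUE;
    }
}
```

The graph would be loaded once with `addEdge` calls. After that, each `run(src, trg)` only touches local maps, which is why an in-memory graph of this size can be expected to answer much faster than one that goes to a storage backend for every vertex.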

As for the code sample itself, you can probably stop iterating over all the edges on a vertex if you don't need to:

for (TitanEdge out: vertex.getEdges()){
    Direction dir = out.getDirection(vertex);
    if (dir.compareTo(Direction.OUT) != 0 && dir.compareTo(Direction.BOTH) != 0){
        continue;
    }

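Instead, if you only want "in" edges, tell Titan so: vertex.getEdges(Direction.IN). That should give you some added efficiency; I would do that even if you decide to load into TinkerGraph. It might also eliminate the need for the getOtherVertex line, since you then know you can simply call out.getVertex(Direction.OUT), which I would guess is faster as well.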
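The idea of asking for one direction up front, instead of fetching every edge and skipping the unwanted ones, can be sketched with plain Java types. Everything here (`DirectionFilterSketch`, its `Vertex`, `Edge`, and `Direction`) is hypothetical and only mimics the shape of the API; it is not Titan or Blueprints code. The question's loop keeps outgoing edges, so the sketch asks for `OUT`; with `IN` the roles of the two endpoints simply flip.

```java
import java.util.ArrayList;
import java.util.List;

class DirectionFilterSketch {
    enum Direction { OUT, IN }

    // Hypothetical directed edge; not a Titan or Blueprints type.
    static final class Edge {
        final long from;
        final long to;

        Edge(long from, long to) {
            this.from = from;
            this.to = to;
        }
    }

    // Hypothetical vertex that keeps its incident edges grouped by direction.
    static final class Vertex {
        final long id;
        private final List<Edge> out = new ArrayList<>();
        private final List<Edge> in = new ArrayList<>();

        Vertex(long id) {
            this.id = id;
        }

        // Only the requested group is returned, so the caller never iterates the other one.
        List<Edge> edges(Direction direction) {
            return direction == Direction.OUT ? out : in;
        }
    }

    static void connect(Vertex from, Vertex to) {
        Edge e = new Edge(from.id, to.id);
        from.out.add(e);
        to.in.add(e);
    }

    // The far end of an outgoing edge is always its head, so no "other vertex" lookup is needed.
    static List<Long> successors(Vertex v) {
        List<Long> result = new ArrayList<>();
        for (Edge e : v.edges(Direction.OUT)) {
            result.add(e.to);
        }
        return result;
    }
}
```

Because the direction is known when the edges are requested, the per-edge direction check and the lookup of the opposite endpoint both disappear from the inner loop.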
