Question

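I am currently migrating some older Neo4j code to the new Neo4j 2.0.0 beta. I think the new schema indexes are a great feature in a range of cases, so I want to change my code to use them where possible. Before doing so, I wanted to make sure performance would not get worse, so I wrote a small test. Surprisingly, schema index lookups were consistently slower than legacy index lookups. Before drawing conclusions, I would like to share my test so you can tell me whether I am doing something illegal, or whether the result is just an artifact of the simplicity of the test case or something similar. You can also try it yourself and confirm or refute my observation. As it stands, I would rather stick with legacy indexes, which even have some nicer properties when used from Java code: you cannot create two indexes with the same name, you simply get the existing one back, whereas a schema index throws an exception; and on index search/get results you have a ".single()" method, whereas with schema indexes I seem to have to use an iterator.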

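My code is below. I tested by commenting out the calls for one kind of index (legacy or schema) and then running the whole thing a few times. I tried various values of N, from the 1000 shown here up to 60000, always with the same relative result: the legacy index performs significantly faster lookups. My use case, obviously, is a large number of nodes, each with a unique ID, where I need to look up a whole range of nodes as quickly as possible and have only their IDs.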

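My question is: are legacy indexes really faster, which would be a major concern for me, or am I doing something wrong, or is this a known issue that will be addressed during the beta and fixed by the release? Thanks!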

import java.io.File;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

import org.apache.commons.io.FileUtils;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Label;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.ResourceIterator;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.graphdb.index.Index;
import org.neo4j.graphdb.schema.IndexDefinition;
import org.neo4j.graphdb.schema.Schema;
import org.neo4j.tooling.GlobalGraphOperations;

enum labels implements Label {
    term
}

public class Neo4jIndexPerformanceTest {
    private static int N = 1000;

    public static void main(String[] args) throws IOException {
        FileUtils.deleteDirectory(new File("tmp/graph.db"));
        GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase("tmp/graph.db");
        try (Transaction tx = graphDb.beginTx()) {
            int i = 0;
            for (Node n : GlobalGraphOperations.at(graphDb).getAllNodes())
                i++;
            System.out.println("Number of nodes: " + i);
        }
//      createLegacyIndex(graphDb);
//      searchLegacyIndex(graphDb);
        createSchemaIndex(graphDb);
        searchSchemaIndex(graphDb);
        graphDb.shutdown();
    }

    private static void searchSchemaIndex(GraphDatabaseService graphDb) {
        try (Transaction tx = graphDb.beginTx()) {
            IndexDefinition index = graphDb.schema().getIndexes(labels.term).iterator().next();
            graphDb.schema().awaitIndexOnline(index, 10, TimeUnit.SECONDS);
        }
        long time = System.currentTimeMillis();
        try (Transaction tx = graphDb.beginTx()) {
            for (int i = 0; i < N; i++) {
                ResourceIterator<Node> iterator = graphDb.findNodesByLabelAndProperty(labels.term, "id", "schema:" + i).iterator();
                if (iterator.hasNext()) {
                    Node n = iterator.next();
                } 
                iterator.close();
            }
        }
        time = System.currentTimeMillis() - time;
        System.out.println("Searching schema index took: " + time + " ms");
    }

    private static void searchLegacyIndex(GraphDatabaseService graphDb) {
        long time = System.currentTimeMillis();
        try (Transaction tx = graphDb.beginTx()) {
            Index<Node> index = graphDb.index().forNodes("terms");
            for (int i = 0; i < N; i++) {
                ResourceIterator<Node> iterator = index.get("id", "legacy:" + i).iterator();
                // Declared outside the if-block so the optional sanity check below compiles when enabled.
                Node single = null;
                if (iterator.hasNext()) {
                    single = iterator.next();
                }
                iterator.close();
                // if (single == null)
                // throw new IllegalStateException();
            }
        }
        time = System.currentTimeMillis() - time;
        System.out.println("Searching legacy index took: " + time + " ms");

    }

    private static void createSchemaIndex(GraphDatabaseService graphDb) {
        Schema schema = null;
        // Create the index in its own transaction, before any nodes are written.
        try (Transaction tx = graphDb.beginTx()) {
            schema = graphDb.schema();
            // Only create the index if the database does not have one yet.
            boolean indexExists = schema.getIndexes().iterator().hasNext();
            if (!indexExists)
                schema.indexFor(labels.term).on("id").create();
            tx.success();
        }
        try (Transaction tx = graphDb.beginTx()) {
            long time = System.currentTimeMillis();

            for (int i = 0; i < N; i++) {
                Node n = graphDb.createNode(labels.term);
                n.setProperty("id", "schema:" + i);
            }

            time = System.currentTimeMillis() - time;
            schema.awaitIndexesOnline(10, TimeUnit.SECONDS);
            tx.success();
            System.out.println("Creating schema index took: " + time + " ms");
        }
    }

    private static void createLegacyIndex(GraphDatabaseService graphDb) {
        try (Transaction tx = graphDb.beginTx()) {
            Index<Node> index = graphDb.index().forNodes("terms");

            long time = System.currentTimeMillis();

            for (int i = 0; i < N; i++) {
                Node n = graphDb.createNode(labels.term);
                n.setProperty("id", "legacy:" + i);
                index.add(n, "id", n.getProperty("id"));
            }

            time = System.currentTimeMillis() - time;
            tx.success();
            System.out.println("Creating legacy index took: " + time + " ms");
        }
    }
}
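
On the ".single()" point above, the iterator boilerplate in the search methods could be wrapped in a small helper. The following is only a sketch: `SingleResult` and `singleOrNull` are hypothetical names, not part of the Neo4j API, and the helper works on a plain `java.util.Iterator`, so a caller holding a Neo4j `ResourceIterator` would still have to close it afterwards.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class SingleResult {

    /**
     * Hypothetical helper (not a Neo4j method): returns the only element of
     * the iterator, or null if the iterator is empty. Throws
     * NoSuchElementException if the iterator yields more than one element.
     */
    public static <T> T singleOrNull(Iterator<T> iterator) {
        if (!iterator.hasNext()) {
            return null;
        }
        T result = iterator.next();
        if (iterator.hasNext()) {
            throw new NoSuchElementException("More than one element found");
        }
        return result;
    }

    public static void main(String[] args) {
        // One match: the element itself is returned.
        String one = singleOrNull(Arrays.asList("schema:1").iterator());
        // No match: null is returned instead of throwing.
        String none = singleOrNull(Collections.<String>emptyList().iterator());
        System.out.println(one);
        System.out.println(none);
    }
}
```

Returning null for "not found" keeps a lookup loop free of `hasNext()` checks, while the exception on a second element would surface duplicate IDs early.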

Answer 1

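I tried your code, and indeed the schema index implementation is not as fast as the legacy one. But I found the culprit: it is a simple bug in the implementation around the index, not in the index itself. I tried fixing those bugs locally, and with the fixes in place the legacy and schema indexes perform exactly the same.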

So it is a matter of getting the fix in properly, and I can only hope that it makes it into 2.0.
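
If you want to re-run the comparison once a build with the fix is available, a harness along these lines might give steadier numbers than a single timed pass. It is only a sketch under assumptions: `LookupBenchmarkSketch` and `bestOfNanos` are made-up names, and a plain `HashMap` stands in for the index so the example runs without Neo4j. In a real test the `Runnable` would wrap the body of `searchSchemaIndex` or `searchLegacyIndex`.

```java
import java.util.HashMap;
import java.util.Map;

public class LookupBenchmarkSketch {

    /** Runs the task `rounds` times and returns the fastest round in nanoseconds. */
    public static long bestOfNanos(Runnable task, int rounds) {
        if (rounds < 1) {
            throw new IllegalArgumentException("rounds must be at least 1");
        }
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        final int n = 1000;
        // Stand-in for an index: a map keyed like the "id" property in the question.
        final Map<String, Integer> fakeIndex = new HashMap<>();
        for (int i = 0; i < n; i++) {
            fakeIndex.put("schema:" + i, i);
        }

        Runnable lookups = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < n; i++) {
                    if (fakeIndex.get("schema:" + i) == null) {
                        throw new IllegalStateException("missing id " + i);
                    }
                }
            }
        };

        // Untimed warm-up rounds so JIT compilation is less likely to skew the result.
        bestOfNanos(lookups, 5);
        long best = bestOfNanos(lookups, 10);
        System.out.println("Best of 10 rounds: " + (best / 1000) + " us");
    }
}
```

Taking the fastest of several rounds, after a few untimed warm-up rounds, reduces the influence of JIT compilation and cold caches, which a single pass timed with `System.currentTimeMillis()` cannot separate from the lookup cost itself.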

Neo4j: Are schema index lookups slower than legacy index lookups?
