Using a custom Lucene Analyzer

Time: 2016-07-11 14:29:05

Tags: neo4j lucene spring-data-neo4j spring-data-neo4j-4 neo4j-ogm

With Neo4j 3.0.3 and SDN 4.2.0.BUILD-SNAPSHOT, I created the following configuration:

@Configuration
@EnableNeo4jRepositories(basePackages = "com.example")
@EnableTransactionManagement
public class Neo4jTestConfig extends Neo4jConfiguration {

    private static final String NEO4J_EMBEDDED_DATABASE_PATH_PROPERTY = "neo4j.embedded.database.path";

    @PostConstruct
    public void init() {
        Components.setDriver(new EmbeddedDriver(graphDatabaseService()));

        EmbeddedDriver embeddedDriver = (EmbeddedDriver) Components.driver();
        GraphDatabaseService databaseService = embeddedDriver.getGraphDatabaseService();
        try (Transaction t = databaseService.beginTx()) {
            Index<Node> autoIndex = databaseService.index().forNodes("node_auto_index");
            databaseService.index().setConfiguration(autoIndex, "type", "fulltext");
            databaseService.index().setConfiguration(autoIndex, "to_lower_case", "true");
            databaseService.index().setConfiguration(autoIndex, "analyzer", StandardAnalyzerV36.class.getName());
            t.success();
        }
    }

    public GraphDatabaseService graphDatabaseService() {

        // @formatter:off
        GraphDatabaseService graphDatabaseService = new GraphDatabaseFactory()
                .newEmbeddedDatabaseBuilder(new File(environment.getProperty(NEO4J_EMBEDDED_DATABASE_PATH_PROPERTY)))
                .setConfig(GraphDatabaseSettings.node_keys_indexable, "name,description")
                .setConfig(GraphDatabaseSettings.node_auto_indexing, "true")
                .newGraphDatabase();
        // @formatter:on

        return graphDatabaseService;
    }

...

}

In addition, I implemented StandardAnalyzerV36:

public final class StandardAnalyzerV36 extends Analyzer {

    public static final CharArraySet STOP_WORDS_SET = StopAnalyzer.ENGLISH_STOP_WORDS_SET;  

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {

        final ClassicTokenizer src = new ClassicTokenizer();
        TokenStream tok = new StandardFilter(src);
        tok = new StopFilter(new LowerCaseFilter(tok), STOP_WORDS_SET);

        return new TokenStreamComponents(src, tok);
    }

    @Override
    protected Reader initReader(String fieldName, Reader reader) {
        return new HTMLStripCharFilter(reader);
    }   
...
}
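
To make it easier to reason about which terms a chain like this should leave in the index, here is a rough plain-Java approximation of its four stages (strip HTML, tokenize, lowercase, drop stop words). It does not use Lucene: the regular expressions are crude stand-ins for HTMLStripCharFilter and ClassicTokenizer, and the class name, method name, and stop-word subset are all hypothetical, so treat the result only as a guide to what a test might expect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java approximation of the analyzer chain above. Not Lucene code.
final class AnalyzerChainSketch {

    // Assumption: a small illustrative subset of English stop words,
    // not the actual contents of Lucene's ENGLISH_STOP_WORDS_SET.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "is", "of", "the", "to");

    static List<String> analyze(String text) {
        // Stage 1 (stand-in for HTMLStripCharFilter): replace tags with spaces.
        String stripped = text.replaceAll("<[^>]*>", " ");

        List<String> tokens = new ArrayList<>();
        // Stage 2 (stand-in for ClassicTokenizer): split on runs of
        // characters that are neither letters nor digits.
        for (String raw : stripped.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() yields an empty first element when the text starts with a delimiter
            }
            // Stage 3 (LowerCaseFilter): normalize case.
            String lower = raw.toLowerCase(Locale.ROOT);
            // Stage 4 (StopFilter): drop stop words.
            if (!STOP_WORDS.contains(lower)) {
                tokens.add(lower);
            }
        }
        return tokens;
    }
}
```

For example, `AnalyzerChainSketch.analyze("<p>The <b>Quick</b> Fox</p>")` returns `[quick, fox]`. If a fulltext query for `quick` finds nothing while a query for the raw stored value does, that suggests the custom analyzer is not the one applied to the index.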

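My tests now fail because StandardAnalyzerV36 does not appear to be applied to the index.

What am I doing wrong, and how can I fix it?

Updated

I tried configuring the driver through ogm.properties: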

driver=org.neo4j.ogm.drivers.embedded.driver.EmbeddedDriver

dbms.auto_index.nodes.keys=name,description
dbms.auto_index.nodes.enabled=true

and this config:

@Configuration
@EnableNeo4jRepositories(basePackages = "com.example")
@EnableTransactionManagement
public class Neo4jTestConfig extends Neo4jConfiguration {

    @Resource
    private Environment environment;

    @PostConstruct
    public void init() {
        EmbeddedDriver embeddedDriver = (EmbeddedDriver) Components.driver();
        GraphDatabaseService databaseService = embeddedDriver.getGraphDatabaseService();

        try (Transaction t = databaseService.beginTx()) {
            Index<Node> autoIndex = databaseService.index().forNodes("node_auto_index");
            databaseService.index().setConfiguration(autoIndex, "type", "fulltext");
            databaseService.index().setConfiguration(autoIndex, "to_lower_case", "true");
            databaseService.index().setConfiguration(autoIndex, "analyzer", StandardAnalyzerV36.class.getName());
            t.success();
        }
    }

    @Override
    public SessionFactory getSessionFactory() {
        return new SessionFactory("com.example");
    }

}

But it still does not work.

Which settings should go into ogm.properties?

1 answer:

Answer 0 (score: 2)

I don't think you should create a new embedded driver with Components.setDriver(new EmbeddedDriver(graphDatabaseService()));

Instead, get the GraphDatabaseService from the driver that is already configured and use that:

    EmbeddedDriver embeddedDriver = (EmbeddedDriver) Components.driver();
    GraphDatabaseService databaseService = embeddedDriver.getGraphDatabaseService();
    try (Transaction t = databaseService.beginTx()) {
        Index<Node> autoIndex = databaseService.index().forNodes("node_auto_index");
        databaseService.index().setConfiguration(autoIndex, "type", "fulltext");
        databaseService.index().setConfiguration(autoIndex, "to_lower_case", "true");
        databaseService.index().setConfiguration(autoIndex, "analyzer", StandardAnalyzerV36.class.getName());
        t.success();
    }

If you want to configure the embedded driver programmatically, you can set it up in SDN like this:

@Bean
public Configuration getConfiguration() {
   Configuration config = new Configuration();
   config
       .driverConfiguration()
       .setDriverClassName("org.neo4j.ogm.drivers.embedded.driver.EmbeddedDriver")
       .setURI("file:///var/tmp/graph.db");
   return config;
}

@Bean
public SessionFactory getSessionFactory() {
    return new SessionFactory(getConfiguration(), <packages> );
}

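Or, similarly, with OGM alone:

// driverConfiguration() returns the nested driver configuration rather than
// the Configuration itself, so the Configuration is kept in its own variable.
Configuration configuration = new Configuration();
configuration
             .driverConfiguration()
             .setDriverClassName("org.neo4j.ogm.drivers.embedded.driver.EmbeddedDriver")
             .setURI(uri);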

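Then obtain a handle to the underlying graph database in the same way as above.

Update: ogm.properties does not support dbms.auto_index.nodes.keys or dbms.auto_index.nodes.enabled. These settings have to be applied when the GraphDatabaseService is created, and the current version of OGM offers no way to do that. See https://github.com/neo4j/neo4j-ogm/issues/134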
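
As a small guard against relying on such entries by accident, the sketch below scans the text of an ogm.properties file and lists keys that look like database-level settings. The class and method names are hypothetical, and the `dbms.` prefix test is an assumption generalized from the two keys named above; only those two are confirmed as unsupported here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

// Hypothetical helper, not part of OGM or SDN.
final class OgmPropertiesCheck {

    // Returns, in sorted order, the keys that start with "dbms.".
    // Assumption: such keys are database settings that must be applied when
    // the GraphDatabaseService is built, so they have no effect in ogm.properties.
    static List<String> databaseLevelKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.stringPropertyNames().stream()
                .filter(key -> key.startsWith("dbms."))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

Run against the ogm.properties shown in the question, this reports `dbms.auto_index.nodes.enabled` and `dbms.auto_index.nodes.keys` and leaves `driver` alone.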