Hibernate Search query fetches data from the actual database instead of the Elasticsearch index

Time: 2018-08-10 11:22:31

Tags: elasticsearch lucene hibernate-search

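We are trying to implement Elasticsearch in our project. So far we have been able to create the indexes in ES. The problem is with retrieval: when we fire a query to retrieve data, it is executed against the actual database rather than against the ES index.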

hibernate.cfg

<property name="hibernate.search.default.indexmanager">elasticsearch</property>
<property name="hibernate.search.default.elasticsearch.host">http://127.0.0.1:9200</property>
<property name="hibernate.search.default.elasticsearch.index_schema_management_strategy">drop-and-create</property>
<property name="hibernate.search.default.elasticsearch.required_index_status">yellow</property>

Search code:

Session session = HibernateSessionFactory.current().getSession("");
fullTextSession = Search.getFullTextSession(session.getSession());
searchFactory = fullTextSession.getSearchFactory();

QueryBuilder titleQB = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(<MyClassHere>.class).get();

Query query = titleQB.phrase().onField(EMAIL1_EDGE_NGRAM_INDEX).andField(EMAIL1_NGRAM_INDEX)
        .boostedTo(5).sentence(searchTerm.toLowerCase()).createQuery();

FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(query, <MyClassHere>.class);
fullTextQuery.setMaxResults(20);

List<Bascltj001TO> results = fullTextQuery.getResultList();
return results;

Entity class:

@Entity
@Indexed
public class MYClass {
    private DBAccessStatus dBAccessStatus;
    private String optname = "";
    private String phone1 = "";
    @Fields({
          @Field(name = "email1", index = Index.YES, store = Store.YES,
        analyze = Analyze.YES, analyzer = @Analyzer(definition = "standardAnalyzer")),
          @Field(name = "edgeNGramEmail1", index = Index.YES, store = Store.NO,
        analyze = Analyze.YES, analyzer = @Analyzer(definition = "autocompleteEdgeAnalyzer")),
          @Field(name = "nGramEmail1", index = Index.YES, store = Store.NO,
        analyze = Analyze.YES, analyzer = @Analyzer(definition = "autocompleteNGramAnalyzer"))
        })
    private String email1 = "";

Elasticsearch JSON data

{
  "_index" : "myclass",
  "_type" : "myclass",
  "_id" : "67",
  "_score" : 1.0,
  "_source" : {
    "id" : "67",
    "cltseqnum" : 67,
    "email1" : "email@clt.com",
    "edgeNGramEmail1" : "email@clt.com",
    "nGramEmail1" : "email@clt.com"
  }
}

Analyzer definitions

@AnalyzerDefs({

        @AnalyzerDef(name = "autocompleteEdgeAnalyzer",

// Split input into tokens according to tokenizer
                tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),

                filters = {
                        // Replace every character other than letters, digits, and dots with a space
                        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                                @Parameter(name = "pattern", value = "([^a-zA-Z0-9\\.])"),
                                @Parameter(name = "replacement", value = " "),
                                @Parameter(name = "replace", value = "all") }),
                        // Normalize token text to lowercase, as the user is unlikely to
                        // care about casing when searching for matches
                        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                        @TokenFilterDef(factory = StopFilterFactory.class),
                        // Index partial words starting at the front, so we can provide
                        // Autocomplete functionality
                        @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = {
                                @Parameter(name = "minGramSize", value = "3"),
                                @Parameter(name = "maxGramSize", value = "50") }) }),

        @AnalyzerDef(name = "autocompleteNGramAnalyzer",

// Split input into tokens according to tokenizer
                tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),

                filters = {
                        // Split tokens further on intra-word delimiters
                        @TokenFilterDef(factory = WordDelimiterFilterFactory.class),
                        // Normalize token text to lowercase, as the user is unlikely to
                        // care about casing when searching for matches
                        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                        @TokenFilterDef(factory = NGramFilterFactory.class, params = {
                                @Parameter(name = "minGramSize", value = "3"),
                                @Parameter(name = "maxGramSize", value = "5") }),
                        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                                @Parameter(name = "pattern", value = "([^a-zA-Z0-9\\.])"),
                                @Parameter(name = "replacement", value = " "),
                                @Parameter(name = "replace", value = "all") }) }),

2 Answers:

Answer 0: (score: 0)

Your query is a full-text query, so it will be executed on the Elasticsearch cluster; Hibernate Search cannot translate it into a database query.

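But... keep in mind that your index does not contain all the data needed to build the entities. So, after getting the IDs from the Elasticsearch cluster, Hibernate Search runs a query against your database to return the results as entities.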

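The only way to avoid this is to use projections to query specific fields from the index, but apart from very specific cases you will generally want to fetch the entities.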

Answer 1: (score: 0)

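My guess is that you are seeing it run queries on both. It certainly is not running the query on the database only, because a full-text query like this cannot be run on a database.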

Default architecture

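By default, a FullTextQuery runs the query on Elasticsearch to find the primary keys of the matching objects, then uses this list of IDs to load fully managed domain objects from the database.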

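This is usually what people want: it makes sure you get the latest version of the data, and that the objects are loaded within the safety of the transaction.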

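It also allows you to apply changes to the objects and have those changes applied to both the database and the Elasticsearch cluster when the transaction commits.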

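Another benefit is that you can load all fields from the database, including those that are not indexed. So you can skip indexing any field that you do not strictly need for running queries, which keeps the index lightweight and fast.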
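
To make the two phases concrete, here is a minimal plain-Java sketch that simulates them with in-memory maps. It does not use the Hibernate Search API: the class `TwoPhaseQuerySketch`, the `Client` entity, the sample rows, and the `databaseReads` counter are all hypothetical stand-ins, and the substring match only approximates a real full-text query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TwoPhaseQuerySketch {

    /** Hypothetical entity: email1 is indexed, phone1 lives only in the database. */
    static final class Client {
        final int id;
        final String email1;
        final String phone1;

        Client(int id, String email1, String phone1) {
            this.id = id;
            this.email1 = email1;
            this.phone1 = phone1;
        }
    }

    // Stand-in for the Elasticsearch index: document id -> lowercased email text.
    private final Map<Integer, String> index = new LinkedHashMap<>();
    // Stand-in for the database table: primary key -> full row.
    private final Map<Integer, Client> database = new LinkedHashMap<>();
    // Counts simulated database round trips so the second phase is visible.
    int databaseReads = 0;

    /** Writes the full row to the "database" and only email1 to the "index". */
    void persist(Client client) {
        database.put(client.id, client);
        index.put(client.id, client.email1.toLowerCase());
    }

    /** Phase 1: match on the index only; the result is a list of primary keys. */
    List<Integer> searchIndex(String term) {
        String needle = term.toLowerCase();
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> doc : index.entrySet()) {
            if (doc.getValue().contains(needle)) {
                ids.add(doc.getKey());
            }
        }
        return ids;
    }

    /** Phase 2: load the entities for those keys, simulated as one batched read. */
    List<Client> loadEntities(List<Integer> ids) {
        if (!ids.isEmpty()) {
            databaseReads++;
        }
        List<Client> entities = new ArrayList<>();
        for (Integer id : ids) {
            entities.add(database.get(id));
        }
        return entities;
    }

    public static void main(String[] args) {
        TwoPhaseQuerySketch sketch = new TwoPhaseQuerySketch();
        sketch.persist(new Client(67, "email@clt.com", "555-0100"));
        sketch.persist(new Client(68, "other@example.org", "555-0101"));

        List<Client> hits = sketch.loadEntities(sketch.searchIndex("CLT"));
        for (Client hit : hits) {
            System.out.println(hit.id + " " + hit.email1 + " " + hit.phone1);
        }
        System.out.println("database reads: " + sketch.databaseReads);
    }
}
```

The database read in the second phase is what shows up in the SQL log, and it is also why `phone1` is available on the result even though it was never indexed.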

Alternative option

If for any reason you want the query to run against Elasticsearch only, you just need to use projections.
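
As a rough illustration of what a projection changes, the plain-Java sketch below returns the requested stored fields as `Object[]` rows straight from a simulated index, without building entities. The class `ProjectionSketch` and its methods are hypothetical and are not the Hibernate Search API; in Hibernate Search 5.x the corresponding call is, as far as I know, `FullTextQuery.setProjection(...)`, and projected fields generally need to be stored (`Store.YES`, as `email1` is in the question's mapping).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProjectionSketch {

    // Stand-in for the index: document id -> stored field values.
    private final Map<String, Map<String, Object>> storedDocs = new LinkedHashMap<>();

    /** Stores id and email1 in the "index", like fields mapped with Store.YES. */
    void indexDocument(String id, String email1) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put("email1", email1);
        storedDocs.put(id, doc);
    }

    /**
     * Matches on email1 and returns one Object[] per hit, holding the requested
     * stored fields in the requested order. No entity is built and nothing
     * outside the index is read.
     */
    List<Object[]> project(String term, String... fields) {
        String needle = term.toLowerCase();
        List<Object[]> rows = new ArrayList<>();
        for (Map<String, Object> doc : storedDocs.values()) {
            String email = ((String) doc.get("email1")).toLowerCase();
            if (!email.contains(needle)) {
                continue;
            }
            Object[] row = new Object[fields.length];
            for (int i = 0; i < fields.length; i++) {
                // A field that was never stored cannot be returned from the index.
                if (!doc.containsKey(fields[i])) {
                    throw new IllegalArgumentException("Field is not stored: " + fields[i]);
                }
                row[i] = doc.get(fields[i]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        ProjectionSketch sketch = new ProjectionSketch();
        sketch.indexDocument("67", "email@clt.com");
        sketch.indexDocument("68", "other@example.org");

        for (Object[] row : sketch.project("clt", "id", "email1")) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The trade-off is that the rows are plain values rather than managed entities, so changes to them are not tracked, and only data present in the index can be returned.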

See the section on projections in the Hibernate Search reference documentation.