Hibernate Search query cannot find some strings longer than 2 characters, and multi-word search does not work

Date: 2018-08-29 12:31:14

Tags: elasticsearch lucene hibernate-search

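I am trying to implement full-text search with Hibernate Search. We need to search on names, addresses and so on. Users may search for names such as "John", "John Murphy", "Mark" or "Mark L Thomas", and for addresses such as "20601 Blvd" or "1st floor".

The current logic works for some words and can match terms longer than two characters, such as "John", but it cannot find "Mark". If I type "Ma" I get results, but "Mar" or "Mark" returns no records. Searching by city, for example "Columbia", does work.

Multi-word search does not work either.

The statements above hold when I use no analyzer, as in the code below. If I use the EdgeNGram, Text or Standard analyzer I get different output, but none of the analyzers gives the expected results. The complete code follows.

The structure of the index I am retrieving data from: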

    {
      "_index" : "client_master_index_0300",
      "_type" : "com.csc.pt.svc.data.to.Basclt0300TO",
      "_id" : "518,1",
      "_score" : 4.0615783,
      "_source" : {
        "id" : "518,1",
        "cltseqnum" : 518,
        "addrseqnum" : "1",
        "addrln1" : "Dba",
        "addrln2" : "Betsy Evans",
        "city" : "SDA",
        "state" : "SC",
        "zipcode" : "89756-4531",
        "country" : "USA",
        "basclt0100to" : {
          "cltseqnum" : 518,
          "clientname" : "Betsy Evans",
          "longname" : "Betsy Evans",
          "id" : "518"
        },
        "basclt0900to" : {
          "cltseqnum" : 518,
          "id" : "518"
        }
      }
    }

The index definition for the same index:

    {
      "client_master_index_0300" : {
        "aliases" : { },
        "mappings" : {
          "com.csc.pt.svc.data.to.Basclt0300TO" : {
            "dynamic" : "strict",
            "properties" : {
              "addrln1" : {
                "type" : "text",
                "store" : true
              },
              "addrln2" : {
                "type" : "text",
                "store" : true
              },
              "addrln3" : {
                "type" : "text",
                "store" : true
              },
              "addrseqnum" : {
                "type" : "text",
                "store" : true
              },
              "basclt0100to" : {
                "properties" : {
                  "clientname" : {
                    "type" : "text",
                    "store" : true
                  },
                  "cltseqnum" : {
                    "type" : "long",
                    "store" : true
                  },
                  "firstname" : {
                    "type" : "text",
                    "store" : true
                  },
                  "id" : {
                    "type" : "keyword",
                    "store" : true,
                    "norms" : true
                  },
                  "longname" : {
                    "type" : "text",
                    "store" : true
                  },
                  "midname" : {
                    "type" : "text",
                    "store" : true
                  }
                }
              },
              "basclt0900to" : {
                "properties" : {
                  "cltseqnum" : {
                    "type" : "long",
                    "store" : true
                  },
                  "email1" : {
                    "type" : "text",
                    "store" : true
                  },
                  "id" : {
                    "type" : "keyword",
                    "store" : true,
                    "norms" : true
                  }
                }
              },
              "city" : {
                "type" : "text",
                "store" : true
              },
              "cltseqnum" : {
                "type" : "long",
                "store" : true
              },
              "country" : {
                "type" : "text",
                "store" : true
              },
              "id" : {
                "type" : "keyword",
                "store" : true
              },
              "state" : {
                "type" : "text",
                "store" : true
              },
              "zipcode" : {
                "type" : "text",
                "store" : true
              }
            }
          }
        },
        "settings" : {
          "index" : {
            "creation_date" : "1535607176216",
            "number_of_shards" : "5",
            "number_of_replicas" : "1",
            "uuid" : "x4R71LNCTBSyO9Taf8siOw",
            "version" : {
              "created" : "6030299"
            },
            "provided_name" : "client_master_index_0300"
          }
        }
      }
    }

The Java object contains the following fields:

    @Field(name = "longname", index = Index.YES, store = Store.YES,
            analyze = Analyze.YES)
    private String longname = "";

    @Field(name = "firstname", index = Index.YES, store = Store.YES,
            analyze = Analyze.YES)
    private String firstname = "";

In addition, I am currently using a wildcard context query:

    public synchronized void searchClienData() {
        String lowerCasedSearchTerm = this.data.getSearchText().toLowerCase();

        SearchFactory searchFactory = fullTextSession.getSearchFactory();
        // The builder was declared as "buildQuery" but used as "queryBuilder"; one name is used here.
        QueryBuilder queryBuilder = searchFactory.buildQueryBuilder().forEntity(Basclt0300TO.class).get();

        // Stored fields returned by the projection, in the order expected by the transformer below.
        String[] projections = {"basclt0100to.longname", "basclt0100to.cltseqnum", "addrln1", "addrln2",
                "city", "state", "zipcode", "country", "basclt0900to.email1"};

        // Keyword query over the name and address fields.
        Query query = queryBuilder.keyword()
                .onField("basclt0100to.longname").andField("addrln1").andField("addrln2")
                .andField("city").andField("state").andField("country").matching(lowerCasedSearchTerm)
                .createQuery();

        FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(query, Basclt0300TO.class);
        // Note: setFirstResult takes the zero-based offset of the first hit,
        // so passing the page size here skips the first page of results.
        fullTextQuery.setMaxResults(this.data.getPageSize()).setFirstResult(this.data.getPageSize());

        List<String> projectedFields = new ArrayList<String>();
        for (String fieldName : projections) {
            projectedFields.add(fieldName);
        }

        @SuppressWarnings("unchecked")
        List<Cltj001ElasticSearchResponseTO> results = fullTextQuery
                .setProjection(projectedFields.toArray(new String[projectedFields.size()]))
                .setResultTransformer(new BasicTransformerAdapter() {
                    private static final long serialVersionUID = 1L;

                    // Maps each projected tuple to a response object, field by field.
                    @Override
                    public Cltj001ElasticSearchResponseTO transformTuple(Object[] tuple, String[] aliases) {
                        return new Cltj001ElasticSearchResponseTO((String) tuple[0], (long) tuple[1],
                                (String) tuple[2], (String) tuple[3], (String) tuple[4],
                                (String) tuple[5], (String) tuple[6], (String) tuple[7], (String) tuple[8]);
                    }
                })
                .getResultList();
        resultsClt0300MasterIndexList = results;
    }

1 Answer:

Answer 0 (score: 0)

First, you need to actually assign the analyzer definition to your fields. Merely defining an analyzer is not enough.

    @Field(name = "longname", index = Index.YES, store = Store.YES,
            analyze = Analyze.YES, analyzer = @Analyzer(definition = "theNameOfSomeAnalyzerDefinition"))
    private String longname = "";

    @Field(name = "firstname", index = Index.YES, store = Store.YES,
            analyze = Analyze.YES, analyzer = @Analyzer(definition = "theNameOfSomeAnalyzerDefinition"))
    private String firstname = "";
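For illustration, here is a rough sketch of what a pair of analyzer definitions might look like with the Hibernate Search 5 annotations. The definition names `name_index` and `name_query`, the class name `ClientNameSketch` and the gram sizes are assumptions made up for this example; the entity mapping annotations are left out, so this is a configuration fragment rather than a complete entity.

```java
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.ngram.EdgeNGramFilterFactory;
import org.apache.lucene.analysis.standard.StandardTokenizerFactory;
import org.hibernate.search.annotations.Analyzer;
import org.hibernate.search.annotations.AnalyzerDef;
import org.hibernate.search.annotations.AnalyzerDefs;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Parameter;
import org.hibernate.search.annotations.TokenFilterDef;
import org.hibernate.search.annotations.TokenizerDef;

// Hypothetical names and sizes; @Entity, @Id and @Indexed are omitted for brevity.
@AnalyzerDefs({
    // Index-time analyzer: lowercases tokens and emits their prefixes (edge n-grams).
    @AnalyzerDef(name = "name_index",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = {
                @Parameter(name = "minGramSize", value = "1"),
                @Parameter(name = "maxGramSize", value = "10")
            })
        }),
    // Query-time analyzer: same tokenizer and lowercasing, but no edge n-grams.
    @AnalyzerDef(name = "name_query",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class)
        })
})
public class ClientNameSketch {

    // The definition only takes effect once it is referenced from the field.
    @Field(name = "longname", analyzer = @Analyzer(definition = "name_index"))
    private String longname = "";
}
```

The second definition is only useful if it is selected at query time. In Hibernate Search 5 the query builder offers `overridesForField(fieldName, analyzerName)` for that purpose, which is one way to apply the analyzer override described in this answer.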

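Then you need to pick one strategy and stick to it:

  • Either you use wildcard queries, which are easy to use and do not require the EdgeNGram token filter, but which cause problems because the query terms are not analyzed;
  • Or you apply the EdgeNGram token filter to your fields, and at query time:
    • use a keyword query without the wildcard option;
    • override the analyzers with different ones, whose definitions are identical to those assigned to your fields except that they do not use the EdgeNGram token filter.

But do not mix the two approaches. Ever. It just will not work.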
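To see why the second strategy handles both prefixes and multi-word input, the following plain-Java sketch imitates the two analyzers without any search library. It is only a simplified model: the class `EdgeNGramSketch`, its method names, the whitespace tokenizer and the gram sizes 1 to 10 are assumptions for this illustration, and requiring every query term to match is a simplification, since the operator used by a real query depends on how it is built.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EdgeNGramSketch {

    /** Query-time analysis: lowercase and split on whitespace, no edge n-grams. */
    public static List<String> queryTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String word : text.toLowerCase().trim().split("\\s+")) {
            if (!word.isEmpty()) {
                terms.add(word);
            }
        }
        return terms;
    }

    /** Index-time analysis: same tokens, but every prefix of each token is indexed. */
    public static Set<String> indexTerms(String text, int minGram, int maxGram) {
        Set<String> terms = new HashSet<>();
        for (String word : queryTerms(text)) {
            int longest = Math.min(maxGram, word.length());
            for (int len = minGram; len <= longest; len++) {
                terms.add(word.substring(0, len));
            }
        }
        return terms;
    }

    /** In this sketch a document matches when every query term is an indexed term. */
    public static boolean matches(String indexedText, String query) {
        return indexTerms(indexedText, 1, 10).containsAll(queryTerms(query));
    }

    public static void main(String[] args) {
        System.out.println(matches("Mark L Thomas", "Mar"));      // true: "mar" is an indexed prefix
        System.out.println(matches("Mark L Thomas", "mark tho")); // true: both terms are indexed prefixes
        System.out.println(matches("Mark L Thomas", "arkt"));     // false: not a prefix of any word
    }
}
```

Because the prefixes already exist in the index, the query side only needs plain lowercased terms. Applying edge n-grams to the query as well would turn "mar" into "m", "ma" and "mar" and match far more documents than intended, which is why the query-time analyzer leaves that filter out.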