How to add the last two digits of a year to a Hibernate Search / Lucene index

Time: 2014-09-01 16:06:28

Tags: lucene hibernate-search

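In my database, years are stored in full form, for example 2012, 2013, 2014, and that is also how they are stored in the index. I would also like to store the last two digits in the index, for example 12, 13, 14. Basically, I want a user to be able to run a keyword search on either 2012 or 12.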

My main search analyzer looks like this.

@AnalyzerDefs({
    @AnalyzerDef(name = "searchtokenanalyzer",
            // Split input into tokens according to tokenizer
            tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "([^a-zA-Z0-9\\-])"),
                    @Parameter(name = "replacement", value = ""),
                    @Parameter(name = "replace", value = "all")}),
                @TokenFilterDef(factory = StopFilterFactory.class),
                @TokenFilterDef(factory = TrimFilterFactory.class)
            }),

I have a second analyzer that handles the year abbreviation, which looks like this.

@AnalyzerDef(name = "yearanalyzer",
            // Split input into tokens according to tokenizer
            tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "^.{2}"),
                    @Parameter(name = "replacement", value = ""),
                    @Parameter(name = "replace", value = "all")}),
                @TokenFilterDef(factory = StopFilterFactory.class),
                @TokenFilterDef(factory = TrimFilterFactory.class)
            })

On my entity field, I have the following.

@Entity
@Indexed
public class YearLookup {
    // The same property is indexed twice: once as the full year, once as the two-digit abbreviation
    @Fields({
        @Field(name = "name", store = Store.NO, index = Index.YES,
                analyze = Analyze.YES, analyzer = @Analyzer(definition = "searchtokenanalyzer")),
        @Field(name = "abbr", store = Store.NO, index = Index.YES,
                analyze = Analyze.YES, analyzer = @Analyzer(definition = "yearanalyzer"))
    })
    private String name;
}

So far, everything is generated correctly in the index, where I can see

name 2012,2013,2014
abbr 12,13,14

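Now I run a search against YearLookup.class with the following code. At query time the abbr year has two characters stripped again, producing an empty value, while name remains intact.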

public interface SearchParam {
    public static final String[] SEARCH_FIELDS = new String[]{"yearLookup.name", "yearLookup.abbr"};
}

String searchString = "14";

QueryBuilder queryBuilder = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(YearLookup.class).get();

TermMatchingContext onWildCardFields = queryBuilder.keyword().wildcard().onField(SearchParam.SEARCH_FIELDS[0]);
TermMatchingContext onFuzzyFields = queryBuilder.keyword().fuzzy().withThreshold(0.7f)
        .withPrefixLength(1).onField(SearchParam.SEARCH_FIELDS[0]);

// Iterate over all the remaining search fields stored in the "VehicleListing" index
for (int i = 1; i < SearchParam.SEARCH_FIELDS.length; i++) {
    onWildCardFields.andField(SearchParam.SEARCH_FIELDS[i]);
    onFuzzyFields.andField(SearchParam.SEARCH_FIELDS[i]);
}

String[] tokens = searchString.toLowerCase().split("\\s");

// Note: each iteration overwrites luceneQuery, so only the last token's query is kept
Query luceneQuery = null;
for (String token : tokens) {
    luceneQuery = queryBuilder.bool()
            .should(onWildCardFields.matching(token + "*").createQuery())
            .should(onFuzzyFields.matching(token).createQuery())
            .createQuery();
}

FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(luceneQuery, YearLookup.class);

Integer results = fullTextQuery.getResultSize();

Now, when I run my test case against this, I get the following exception.

HSEARCH000146: The query string '14' applied on field 'yearLookup.abbr' has no meaningful tokens to be matched. Validate the query input against the Analyzer applied on this field.
org.hibernate.search.errors.EmptyQueryException
    at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:111)
    at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:86)
    at com.domain.auto.services.search.impl.SearchManagerImpl.doSearch(SearchManagerImpl.java:146)
    at $SearchManager_138fdc525111b303.doSearch(Unknown Source)
    at $SearchManager_138fdc525111b2f3.doSearch(Unknown Source)
    at com.domain.auto.services.search.impl.SearchServiceImplTest.testYearSearch(SearchServiceImplTest.java:92)

Does anyone have any ideas?

2 answers:

Answer 0 (score: 0)

Solution

@AnalyzerDef(name = "yearanalyzer",
        // Split input into tokens according to tokenizer
        tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                @Parameter(name = "pattern", value = "^\\d{2}(\\d{2})$"),
                @Parameter(name = "replacement", value = "$1"),
                @Parameter(name = "replace", value = "all")}),
        })
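
To see why this pattern avoids the EmptyQueryException, it may help to compare the two regular expressions outside of Hibernate Search. The sketch below assumes PatternReplaceFilter with replace = "all" behaves like java.util.regex's Matcher.replaceAll; the class YearPatternDemo and its method names are made up for illustration and are not part of any library.

```java
import java.util.regex.Pattern;

public class YearPatternDemo {
    // Pattern from the question's yearanalyzer: drops the first two characters of any input
    static final Pattern ORIGINAL = Pattern.compile("^.{2}");
    // Pattern from this answer: only rewrites inputs that are exactly four digits
    static final Pattern FIXED = Pattern.compile("^\\d{2}(\\d{2})$");

    public static String applyOriginal(String token) {
        return ORIGINAL.matcher(token).replaceAll("");
    }

    public static String applyFixed(String token) {
        return FIXED.matcher(token).replaceAll("$1");
    }

    public static void main(String[] args) {
        // Index time: the stored value is the full year
        System.out.println("[" + applyOriginal("2014") + "]"); // [14]
        System.out.println("[" + applyFixed("2014") + "]");    // [14]
        // Query time: the user types the abbreviation
        System.out.println("[" + applyOriginal("14") + "]");   // []  (empty token)
        System.out.println("[" + applyFixed("14") + "]");      // [14]
    }
}
```

Both patterns produce the same abbr value at index time. The difference only shows when the analyzer is applied to the search string: the original pattern reduces "14" to an empty token, while the anchored four-digit pattern leaves a two-digit input untouched.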

Answer 1 (score: 0)

Create a bridge that covers both cases and handles the String, like this:

 @FieldBridge(impl = YearFieldBridge.class)
 private String name;

And create a bridge class similar to this:

import java.io.Serializable;
import org.hibernate.search.bridge.StringBridge;

public class YearFieldBridge implements StringBridge, Serializable {
    private static final long serialVersionUID = 1L;
    @Override
    public String objectToString(Object value) {
        if(value != null) {
            if(value instanceof String) {
                String strVal = (String) value;
                strVal = strVal.toUpperCase();
                if(strVal.length() == 2){
                    return "20"+strVal;
                }else{
                    return strVal;
                }
            }
        }
        return null;
    }
}
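
The bridge's branching can be checked without Hibernate Search on the classpath. The following is only a sketch that mirrors objectToString as a plain static method; YearNormalizer and normalize are hypothetical names chosen for this example.

```java
public class YearNormalizer {
    // Mirrors the logic of YearFieldBridge.objectToString, for illustration only
    public static String normalize(Object value) {
        if (value instanceof String) {
            String strVal = ((String) value).toUpperCase();
            // Assumption carried over from the answer: every two-digit year belongs to the 2000s
            return strVal.length() == 2 ? "20" + strVal : strVal;
        }
        // null and non-String values fall through, as in the bridge
        return null;
    }

    public static void main(String[] args) {
        System.out.println(normalize("14"));   // 2014
        System.out.println(normalize("2014")); // 2014
        System.out.println(normalize("99"));   // 2099
        System.out.println(normalize(null));   // null
    }
}
```

Note that "99" becomes "2099" rather than "1999", so this approach probably only fits data sets where all abbreviated years fall in the current century.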