Apache Lucene QueryParser.parse does not apply the analyzer to FuzzyQuery

Date: 2019-01-16 22:26:37

Tags: java elasticsearch search lucene information-retrieval

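My SerbianAnalyzer is invoked when parsing a TermQuery or a PhraseQuery, but not when parsing a FuzzyQuery. I tried both Lucene 4 and Lucene 7 and the behavior is the same. I have the following code: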

Query query;
String field = "text";
String value = "дањ";

QueryParser queryParser = new QueryParser(field, new SerbianAnalyzer());

System.out.println("\nTermQuery");
query = new TermQuery(new Term(field, value));
System.out.println("Query (preParse): " + (TermQuery)query);
System.out.println("Query.toString(field): " + ((TermQuery)query).toString(field));
System.out.println("Query (afterParse): " + queryParser.parse(((TermQuery)query).toString(field)));

System.out.println("\nPhraseQuery");
String[] terms = value.split(" ");
query = new PhraseQuery(field, terms);
System.out.println("Query (preParse): " + ((PhraseQuery)query));
System.out.println("Query.toString(field): " + ((PhraseQuery)query).toString(field));
System.out.println("Query (afterParse): " + queryParser.parse(((PhraseQuery)query).toString(field)));

System.out.println("\nFuzzyQuery");
query = new FuzzyQuery(new Term(field, value), 1);
System.out.println("Query (preParse): " + ((FuzzyQuery)query));
System.out.println("Query.toString(field): " + ((FuzzyQuery)query).toString(field));
System.out.println("Query (afterParse): " + queryParser.parse(((FuzzyQuery)query).toString(field)));

The results I get are:

TermQuery
Query (preParse): text:дањ
Query.toString(field): дањ
Query (afterParse): text:danj

PhraseQuery
Query (preParse): text:"дањ"
Query.toString(field): "дањ"
Query (afterParse): text:danj

FuzzyQuery
Query (preParse): text:дањ~1
Query.toString(field): дањ~1
Query (afterParse): text:дањ~1

1 answer:

Answer 0 (score: 0)

The problem is that, for a long time, QueryParser did not parse the query properly (no analyzer was applied) when the query was a FuzzyQuery, WildcardQuery, PrefixQuery, or RegexpQuery.

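To address this, Lucene had the AnalyzingQueryParser class, which overrides Lucene's default QueryParser so that fuzzy, prefix, range, and wildcard queries are also passed through the given analyzer, while the wildcard characters * and ? are not removed from the search terms.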

However, as of Lucene 7.4, this functionality has been merged into QueryParserBase, which now has proper methods for handling these queries, for example:

protected Query getFuzzyQuery(String field,
                              String termStr,
                              float minSimilarity)

So you should create a QueryParser that overrides this method and call parse on it, rather than creating a ComplexPhraseQueryParser class.
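
A minimal sketch of such a subclass, assuming the Lucene 7.x classic query parser (org.apache.lucene.queryparser.classic). The class name AnalyzingFuzzyQueryParser and the helper analyzeSingleTerm are hypothetical names chosen for illustration. The sketch assumes the analyzer produces exactly one token for a fuzzy term and raises a ParseException otherwise:

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;

// Hypothetical subclass: runs the fuzzy term through the full analysis chain
// before the default fuzzy-query construction takes place.
public class AnalyzingFuzzyQueryParser extends QueryParser {

    public AnalyzingFuzzyQueryParser(String field, Analyzer analyzer) {
        super(field, analyzer);
    }

    @Override
    protected Query getFuzzyQuery(String field, String termStr, float minSimilarity)
            throws ParseException {
        // Replace the raw term with its analyzed form, then delegate to the default logic.
        return super.getFuzzyQuery(field, analyzeSingleTerm(field, termStr), minSimilarity);
    }

    // Hypothetical helper: returns the single token the analyzer emits for the text.
    // Falls back to the raw text if the analyzer emits nothing (e.g. a stop word).
    private String analyzeSingleTerm(String field, String text) throws ParseException {
        try (TokenStream stream = getAnalyzer().tokenStream(field, text)) {
            CharTermAttribute termAtt = stream.addAttribute(CharTermAttribute.class);
            stream.reset();
            if (!stream.incrementToken()) {
                stream.end();
                return text;
            }
            String analyzed = termAtt.toString();
            if (stream.incrementToken()) {
                throw new ParseException(
                        "Fuzzy term \"" + text + "\" was analyzed into more than one token");
            }
            stream.end();
            return analyzed;
        } catch (IOException e) {
            ParseException pe = new ParseException("Could not analyze fuzzy term: " + text);
            pe.initCause(e);
            throw pe;
        }
    }
}
```

With the code from the question, this would be used as new AnalyzingFuzzyQueryParser(field, new SerbianAnalyzer()).parse("дањ~1"). Whether the resulting term matches what TermQuery and PhraseQuery produce depends on the analyzer emitting a single token for the input.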