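Using boolean logic correctly in Lucene

Time: 2016-09-20 16:00:15

Tags: java lucene

Apologies for the question, but this has me somewhat confused.

I have a set of address objects, and I am trying to find the relevant ones with a query that (in pseudocode) looks like this: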

SELECT
WHERE
     Fuzzy(addr1, "address line 1") // = true
AND
     (Fuzzy(addr2, "address line 2") OR
      Fuzzy(addrcity, "address city") OR
      //all the other address fields
     )

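Basically, I want to bring back every entity where the first address line matches approximately and at least one of the other address parts also fuzzy-matches.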
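For reference, the intended condition can be sketched in plain Java, independent of Lucene. This is only an illustration under stated assumptions: the class and method names are hypothetical, "fuzzy" is taken to mean a plain Levenshtein distance of at most 2, and whole field values are compared as single strings. Lucene's FuzzyQuery works on indexed terms and may score edits differently, so treat this as a statement of intent rather than a description of what Lucene does.

```java
// Plain-Java sketch of the intended predicate; not Lucene code.
// All names here are hypothetical and exist only for illustration.
class AddressMatchSketch {

    // Classic Levenshtein edit distance (insert, delete, substitute).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumption: "fuzzy" means at most 2 edits apart.
    static boolean fuzzy(String stored, String wanted) {
        return editDistance(stored, wanted) <= 2;
    }

    // Line one must match, and at least one {stored, wanted} pair
    // among the remaining fields must match as well.
    static boolean isCandidate(String storedLine1, String wantedLine1,
                               String[]... otherFields) {
        if (!fuzzy(storedLine1, wantedLine1)) {
            return false;
        }
        for (String[] pair : otherFields) {
            if (fuzzy(pair[0], pair[1])) {
                return true;
            }
        }
        return false;
    }
}
```

Each entry in otherFields is a two-element array holding the stored value and the value being searched for. With no such entries the method returns false, which matches the pseudocode: the OR group needs at least one hit.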

I have verified that the data exists with this query:

Query toRun = new FuzzyQuery(new Term("addr1", getLineOne()));

That returns a document containing all the correct fields.

My code is as follows:

public List<Address> search() {
    List<Address> results = new ArrayList<>();

    BooleanQuery.Builder queryBuilder = new BooleanQuery.Builder();
    queryBuilder.setMinimumNumberShouldMatch(2);

    BooleanQuery.Builder subQueryBuilder = new BooleanQuery.Builder();
    subQueryBuilder.setMinimumNumberShouldMatch(1);

    if(!getLineOne().equals("")) {
        Query query = new FuzzyQuery(new Term("addr1", getLineOne()));
        queryBuilder.add(query, BooleanClause.Occur.MUST);
    }

    if(!getLineTwo().equals("")) {
        Query query = new FuzzyQuery(new Term("addr2", getLineTwo()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCity().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcity", getCity()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCounty().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcounty", getCounty()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCountry().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcountry", getCountry()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getPostcode().equals("")) {
        Query query = new FuzzyQuery(new Term("addrpostcode", getPostcode()));
        subQueryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }

    queryBuilder.add(subQueryBuilder.build(), BooleanClause.Occur.MUST);

    try {
        Query toRun = queryBuilder.build();

        List<Document> searchResults = SearchEngine.getInstance(SEARCH_ENGINE)
                .performSearch(toRun, 50);

        searchResults.forEach(result -> {
            results.add(new Address(result));
        });
    } catch (IOException e) {
        e.printStackTrace();
    }


    return results;
}

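When the object is given a first line, a second line, and a country, this produces a query whose text form looks like this:

(+addr1:address line 1~2 +((addr2:address line 2~2 addrcountry:romania~2)~1))~2

As stated above, this returns nothing.

Where is my logic going wrong?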

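1 answer:

Answer 0 (score: 1)

You need to get rid of the first setMinimumNumberShouldMatch call.

setMinimumNumberShouldMatch specifies how many SHOULD clauses must match. Your queryBuilder has no SHOULD clauses (both of its clauses are MUST), so it can never match two of them, and therefore you get no results.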
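To make that concrete, here is a toy model in plain Java of how MUST and SHOULD clauses combine with a minimum-should-match value for a single document. It is a sketch, not Lucene's implementation: the class, enum, and method names are hypothetical, it ignores MUST_NOT and FILTER clauses, and it ignores scoring. It does include the rule that a query with no MUST clauses needs at least one SHOULD clause to match.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of MUST / SHOULD evaluation for one document.
// NOT Lucene code; every name below is hypothetical.
class MinShouldMatchSketch {

    enum Occur { MUST, SHOULD }

    // One clause together with whether it matched the document under test.
    static final class Clause {
        final Occur occur;
        final boolean matched;

        Clause(Occur occur, boolean matched) {
            this.occur = occur;
            this.matched = matched;
        }
    }

    // True when every MUST clause matched and enough SHOULD clauses matched.
    // MUST clauses never count toward the minimum.
    static boolean matches(List<Clause> clauses, int minShouldMatch) {
        boolean hasMust = false;
        int shouldHits = 0;
        for (Clause clause : clauses) {
            if (clause.occur == Occur.MUST) {
                hasMust = true;
                if (!clause.matched) {
                    return false;
                }
            } else if (clause.matched) {
                shouldHits++;
            }
        }
        // With no MUST clauses, at least one SHOULD clause has to match.
        int required = (!hasMust && minShouldMatch == 0) ? 1 : minShouldMatch;
        return shouldHits >= required;
    }

    public static void main(String[] args) {
        // Document where addr1, addr2 and addrcountry all fuzzy-match.
        boolean inner = matches(Arrays.asList(
                new Clause(Occur.SHOULD, true),    // addr2
                new Clause(Occur.SHOULD, true)),   // addrcountry
                1);

        // Outer query from the question: two MUST clauses, minimum of 2.
        boolean asAsked = matches(Arrays.asList(
                new Clause(Occur.MUST, true),      // addr1
                new Clause(Occur.MUST, inner)),    // nested sub-query
                2);

        // Same clauses with no minimum on the outer query.
        boolean withoutMinimum = matches(Arrays.asList(
                new Clause(Occur.MUST, true),
                new Clause(Occur.MUST, inner)),
                0);

        System.out.println("outer minimum 2: " + asAsked);
        System.out.println("outer minimum 0: " + withoutMinimum);
    }
}
```

In this model the outer query with a minimum of 2 rejects the document even though every clause matched, because it has zero SHOULD clauses to count. Dropping the outer minimum lets the same document through.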

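You could remove both setMinimumNumberShouldMatch lines and the query would work as intended. Alternatively, you could keep the minimumShouldMatch logic and simplify to a single BooleanQuery, like this: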

public List<Address> search() {
    List<Address> results = new ArrayList<>();

    BooleanQuery.Builder queryBuilder = new BooleanQuery.Builder();
    queryBuilder.setMinimumNumberShouldMatch(1);

    if(!getLineOne().equals("")) {
        //This is a MUST clause, and so doesn't factor into the minimumShouldMatch
        Query query = new FuzzyQuery(new Term("addr1", getLineOne()));
        queryBuilder.add(query, BooleanClause.Occur.MUST);
    }

    if(!getLineTwo().equals("")) {
        Query query = new FuzzyQuery(new Term("addr2", getLineTwo()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCity().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcity", getCity()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCounty().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcounty", getCounty()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getCountry().equals("")) {
        Query query = new FuzzyQuery(new Term("addrcountry", getCountry()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }
    if(!getPostcode().equals("")) {
        Query query = new FuzzyQuery(new Term("addrpostcode", getPostcode()));
        queryBuilder.add(query, BooleanClause.Occur.SHOULD);
    }

    try {
        Query toRun = queryBuilder.build();

        List<Document> searchResults = SearchEngine.getInstance(SEARCH_ENGINE)
                .performSearch(toRun, 50);

        searchResults.forEach(result -> {
            results.add(new Address(result));
        });
    } catch (IOException e) {
        e.printStackTrace();
    }

    return results;
}