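I want to search my index across two fields named "a" and "b". I receive searches such as Freud -- theories of psychology, and I would like to run the following query:
(a="Freud" AND b="theories of psychology") OR (b="Freud" AND a="theories of psychology")
How do I do this? So far I have Lucene build the two halves (firstHalf and secondHalf) using MultiFieldQueryParser, and then I combine them with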
BooleanQuery combined = new BooleanQuery();
combined.add(firstHalf, BooleanClause.Occur.SHOULD);
combined.add(secondHalf, BooleanClause.Occur.SHOULD);
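But combined returns results in which only "theories" is found and not "psychology", and I definitely want both terms. It seems Lucene splits "theories of psychology" into three words and combines them individually with OR. How do I prevent this?
firstHalf looks like: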
Query firstHalf = MultiFieldQueryParser.parse(Version.LUCENE_33,
new String[]{"Freud", "theories of psychology"},
new String[]{"a", "b"},
new BooleanClause.Occur[]{BooleanClause.Occur.MUST, BooleanClause.Occur.MUST},
analyzer);
where analyzer is just a StandardAnalyzer object.
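Answer 0: (score: 4)
I figured it out myself, but the code is now noticeably longer; if anyone knows a more elegant solution, please post it and I will happily reward it. :) (I will soon turn this into a method, but here is the full version of what is happening.)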
// First half: a = "Freud" AND b = "theories of psychology".
// AND_OPERATOR makes every term of a multi-word input required.
QueryParser parser = new QueryParser(Version.LUCENE_33, "a", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query a_0 = parser.parse("Freud");
parser = new QueryParser(Version.LUCENE_33, "b", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query b_1 = parser.parse("theories of psychology");
BooleanQuery firstHalf = new BooleanQuery();
firstHalf.add(a_0, BooleanClause.Occur.MUST);
firstHalf.add(b_1, BooleanClause.Occur.MUST);
// Second half: the same two inputs with the fields swapped.
parser = new QueryParser(Version.LUCENE_33, "b", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query b_0 = parser.parse("Freud");
parser = new QueryParser(Version.LUCENE_33, "a", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
Query a_1 = parser.parse("theories of psychology");
BooleanQuery secondHalf = new BooleanQuery();
secondHalf.add(b_0, BooleanClause.Occur.MUST);
secondHalf.add(a_1, BooleanClause.Occur.MUST);
// A document matches if it satisfies either half.
BooleanQuery combined = new BooleanQuery();
combined.add(firstHalf, BooleanClause.Occur.SHOULD);
combined.add(secondHalf, BooleanClause.Occur.SHOULD);
It turns out SHOULD does work the way I need it to. I hope someone finds this useful and that I'm not just talking to myself in public ;)
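Answer 1: (score: 2)
StandardAnalyzer tokenizes its input. With the query parser's default OR operator, the query theories of psychology is therefore equivalent to theories OR of OR psychology.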
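To make the difference concrete, here is a toy model in plain Java, not Lucene itself. It splits text on whitespace (a simplification of what StandardAnalyzer does) and compares OR matching with AND matching against a document that mentions "theories" but not "psychology". The class TokenMatchDemo and its methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class TokenMatchDemo {

    // Lowercases and splits on whitespace; a rough stand-in for an analyzer.
    static Set<String> tokens(String text) {
        String[] parts = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        return new HashSet<>(Arrays.asList(parts));
    }

    // OR semantics: the document matches if it shares any token with the query.
    static boolean matchesAny(String doc, String query) {
        Set<String> docTokens = tokens(doc);
        for (String t : tokens(query)) {
            if (docTokens.contains(t)) {
                return true;
            }
        }
        return false;
    }

    // AND semantics: every query token must be present in the document.
    static boolean matchesAll(String doc, String query) {
        return tokens(doc).containsAll(tokens(query));
    }

    public static void main(String[] args) {
        String doc = "competing theories of learning";
        String query = "theories of psychology";
        System.out.println(matchesAny(doc, query)); // true: "theories" is shared
        System.out.println(matchesAll(doc, query)); // false: "psychology" is missing
    }
}
```

Note that AND matching still ignores word order; a phrase query additionally requires the tokens to appear next to each other.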
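If you want to search for the phrase "theories of psychology", use a PhraseQuery, or note that the default QueryParser interprets quotation marks as a phrase (i.e. change the string in your code to "\"theories of psychology\"").
Yes, there is a sense in which Lucene does not use Boolean logic, but that is a technicality and not really relevant here.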
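The quoting advice can be combined with the query parser's field:"phrase" syntax to express the whole swapped-field search as a single string. The following is a minimal sketch, not code from this answer: the class SwappedFieldQuery and its methods phrase and build are hypothetical names, and the code only builds the string. It assumes the result would then be passed to a QueryParser's parse method with the same analyzer used for indexing, and that a backslash is an acceptable escape for quotes inside a phrase.

```java
class SwappedFieldQuery {

    // Wraps text in double quotes so the query parser treats it as a phrase.
    // Backslashes and embedded quotes are escaped with a backslash first.
    static String phrase(String field, String text) {
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    // Builds (f1:"x" AND f2:"y") OR (f2:"x" AND f1:"y").
    static String build(String f1, String f2, String x, String y) {
        String first = "(" + phrase(f1, x) + " AND " + phrase(f2, y) + ")";
        String second = "(" + phrase(f2, x) + " AND " + phrase(f1, y) + ")";
        return first + " OR " + second;
    }

    public static void main(String[] args) {
        // Prints:
        // (a:"Freud" AND b:"theories of psychology") OR (b:"Freud" AND a:"theories of psychology")
        System.out.println(build("a", "b", "Freud", "theories of psychology"));
    }
}
```

Building one string keeps the Boolean structure visible in one place, at the cost of having to escape user input before it reaches the parser.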
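Answer 2: (score: 2)
I wrote the class below to generate chained fuzzy queries in which a term must be searched for across multiple fields.
The combined query can be retrieved by calling the GetQuery() method.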
using Lucene.Net.Index;
using Lucene.Net.Search;

public class QueryParam
{
    public string[] Fields { get; set; }
    public string Term { get; set; }
    private QueryParam _andOperandSuffix;
    private QueryParam _orOperandSuffix;
    private readonly BooleanQuery _indexerQuery = new BooleanQuery();

    public QueryParam(string term, params string[] fields)
    {
        Term = term;
        Fields = fields;
    }

    public QueryParam And(QueryParam queryParam)
    {
        _andOperandSuffix = queryParam;
        return this;
    }

    public QueryParam Or(QueryParam queryParam)
    {
        _orOperandSuffix = queryParam;
        return this;
    }

    // Adds clauses to the shared _indexerQuery, so call this once per instance.
    public BooleanQuery GetQuery()
    {
        // One fuzzy clause per field; a match in any field is enough.
        foreach (var field in Fields)
            _indexerQuery.Add(new FuzzyQuery(new Term(field, Term)), Occur.SHOULD);
        // Note: in Lucene, SHOULD clauses become optional once a MUST clause is present.
        if (_andOperandSuffix != null)
            _indexerQuery.Add(_andOperandSuffix.GetQuery(), Occur.MUST);
        if (_orOperandSuffix != null)
            _indexerQuery.Add(_orOperandSuffix.GetQuery(), Occur.SHOULD);
        return _indexerQuery;
    }
}
Example:
var leftquery = new QueryParam("Freud", "a").And(new QueryParam("theories of psychology", "b"));
var rightquery = new QueryParam("Freud", "b").And(new QueryParam("theories of psychology", "a"));
var query = leftquery.Or(rightquery);