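I need help determining which query types to use in given situations.
I think I'm right in saying that if I store the word "FORD" in a Lucene field and want to find an exact match, I would use a TermQuery?
But which query type should I use if I am looking for the word "FORD" where the field's contents are stored as:
"FORD | HONDA | SUZUKI"
And what if I want to search the contents of an entire page for a phrase, such as "please help me"?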
Answer 0 (score: 3)
If you want to search for FORD in FORD|HONDA|SUZUKI, either index the field with Field.Index.ANALYZED, or store each value separately as below so that you can use TermQuery:
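// Namespaces used: System, Lucene.Net.Analysis.Standard, Lucene.Net.Documents, Lucene.Net.Index,
// Lucene.Net.QueryParsers, Lucene.Net.Search, Lucene.Net.Store
var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);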
var fs = FSDirectory.Open("test.index");
//Index a Test Document
IndexWriter wr = new IndexWriter(fs, analyzer, true, IndexWriter.MaxFieldLength.LIMITED);
var doc = new Document();
doc.Add(new Field("Model", "FORD", Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.Add(new Field("Model", "HONDA", Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.Add(new Field("Model", "SUZUKI", Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.Add(new Field("Text", @"What if i was to search the contents of an entire page, looking for a phrase? such as ""please help me""?",
    Field.Store.YES, Field.Index.ANALYZED));
wr.AddDocument(doc);
wr.Commit();
var reader = wr.GetReader();
var searcher = new IndexSearcher(reader);
//Use TermQuery for "NOT_ANALYZED" fields
var result = searcher.Search(new TermQuery(new Term("Model", "FORD")), 100);
foreach (var item in result.ScoreDocs)
{
    Console.WriteLine("1)" + reader.Document(item.Doc).GetField("Text").StringValue);
}
//Use QueryParser for "ANALYZED" fields
var qp = new QueryParser(Lucene.Net.Util.Version.LUCENE_30, "Text", analyzer);
result = searcher.Search(qp.Parse(@"""HELP ME"""), 100);
foreach (var item in result.ScoreDocs)
{
    Console.WriteLine("2)" + reader.Document(item.Doc).GetField("Text").StringValue);
}
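A TermQuery searches for a term exactly as it is stored in the index, and what is stored depends on how you indexed the field (NOT_ANALYZED, or ANALYZED plus whichever analyzer you chose). It is most commonly used with NOT_ANALYZED fields.
You can use TermQuery with ANALYZED fields too, but then you need to know how your analyzer tokenizes the input string. StandardAnalyzer, for instance, lowercases tokens, so the indexed term is "help" rather than "HELP". Below is a sample that shows how an analyzer tokenizes your input: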
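// Namespaces used: System, System.IO (StringReader), Lucene.Net.Analysis,
// Lucene.Net.Analysis.Standard, Lucene.Net.Analysis.Tokenattributes (ITermAttribute)
var text = @"What if i was to search the contents of an entire page, looking for a phrase? such as ""please help me""?";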
var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30 );
//var analyzer = new WhitespaceAnalyzer();
//var analyzer = new KeywordAnalyzer();
//var analyzer = new SimpleAnalyzer();
var ts = analyzer.TokenStream("", new StringReader(text));
var termAttr = ts.GetAttribute<ITermAttribute>();
while (ts.IncrementToken())
{
    Console.Write("[" + termAttr.Term + "] ");
}
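To make the point about ANALYZED fields concrete, here is a minimal sketch of querying such a field without QueryParser. It assumes Lucene.Net 3.0.3 and uses an in-memory RAMDirectory; the class name AnalyzedFieldQueryDemo and its helper methods are made up for illustration and are not part of the answer above. Because StandardAnalyzer lowercases tokens, the upper-case term should find nothing, while the lower-case term and the hand-built PhraseQuery should each match the one document.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

static class AnalyzedFieldQueryDemo
{
    // Hypothetical helper: builds a one-document in-memory index with an ANALYZED "Text" field.
    public static Directory BuildIndex()
    {
        var dir = new RAMDirectory();
        var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
        using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.LIMITED))
        {
            var doc = new Document();
            doc.Add(new Field("Text", "such as \"please help me\"?", Field.Store.YES, Field.Index.ANALYZED));
            writer.AddDocument(doc);
            writer.Commit();
        }
        return dir;
    }

    // Hypothetical helper: runs a query and returns the number of matching documents.
    public static int CountHits(Directory dir, Query query)
    {
        using (var searcher = new IndexSearcher(dir, true))
        {
            return searcher.Search(query, 10).TotalHits;
        }
    }

    // Builds the phrase by hand, using the terms in the form the analyzer stored them.
    public static Query BuildPhrase()
    {
        var phrase = new PhraseQuery();
        phrase.Add(new Term("Text", "please"));
        phrase.Add(new Term("Text", "help"));
        phrase.Add(new Term("Text", "me"));
        return phrase;
    }

    static void Main()
    {
        var dir = BuildIndex();
        // TermQuery does no analysis, so the term must match the indexed token exactly.
        Console.WriteLine("HELP: " + CountHits(dir, new TermQuery(new Term("Text", "HELP"))));
        Console.WriteLine("help: " + CountHits(dir, new TermQuery(new Term("Text", "help"))));
        Console.WriteLine("phrase: " + CountHits(dir, BuildPhrase()));
    }
}
```

QueryParser is usually the more convenient route, since it runs the query text through the same analyzer; building TermQuery or PhraseQuery by hand only makes sense when you already know the exact indexed tokens.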
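Answer 1 (score: 1)
I would turn the problem sideways and put each of the field's multiple values into the index individually; this should make searching simpler. Looking at "Field Having Multiple Values" might help.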
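As a rough sketch of that idea, assuming Lucene.Net 3.0.3 and that the raw value really is pipe-delimited as in the question, the string can be split at index time and each piece added as its own NOT_ANALYZED "Model" field. The class MultiValueFieldDemo and its methods are hypothetical names used only for this illustration.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

static class MultiValueFieldDemo
{
    // Hypothetical helper: splits "FORD | HONDA | SUZUKI" and adds one "Model" field per value.
    public static Document BuildDocument(string rawModels)
    {
        var doc = new Document();
        foreach (var model in rawModels.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Trim the surrounding spaces so the stored term is exactly "FORD", not " FORD ".
            doc.Add(new Field("Model", model.Trim(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        }
        return doc;
    }

    // Hypothetical helper: indexes one document in memory and counts exact TermQuery matches.
    public static int CountMatches(string rawModels, string model)
    {
        var dir = new RAMDirectory();
        // The analyzer is not applied to NOT_ANALYZED fields; KeywordAnalyzer is just a placeholder.
        using (var writer = new IndexWriter(dir, new KeywordAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            writer.AddDocument(BuildDocument(rawModels));
            writer.Commit();
        }
        using (var searcher = new IndexSearcher(dir, true))
        {
            return searcher.Search(new TermQuery(new Term("Model", model)), 10).TotalHits;
        }
    }

    static void Main()
    {
        Console.WriteLine(CountMatches("FORD | HONDA | SUZUKI", "FORD"));
        Console.WriteLine(CountMatches("FORD | HONDA | SUZUKI", "ford"));
    }
}
```

With the values stored verbatim, the match is case-sensitive: "FORD" should be found and "ford" should not. If case-insensitive matching is wanted, one option is to normalize the case yourself both when indexing and when building the Term.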