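I was just browsing articles about lucene.net. I found some sample code that creates an index with lucene.net, and a few lines of it are unclear to me. Here is the code: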
protected void btnCreateIndex_Click(object sender, EventArgs e)
{
    IndexWriter writer = new IndexWriter(MapPath("~/searchlucene/"), new StandardAnalyzer(), false);
    IndexDocument(writer, "About Hockey", "hockey", "Hockey is a cool sport which I really like, bla bla");
    IndexDocument(writer, "Some great players", "hockey", "Some of the great players from Sweden - well Peter Forsberg, Mats Sunding, Henrik Zetterberg");
    IndexDocument(writer, "Soccer info", "soccer", "Soccer might not be as fun as hockey but it's also pretty fun");
    IndexDocument(writer, "Players", "soccer", "From Sweden we have Zlatan Ibrahimovic and Henrik Larsson. They are the most well known soccer players");
    IndexDocument(writer, "1994", "soccer", "I remember World Cup 1994 when Sweden took the bronze. we had great players. players , bla bla");
    IndexDocument(writer, "BBA-header", "BBA-321type", "Hello BBA");
    writer.Optimize();
    writer.Close();
}

private void IndexDocument(IndexWriter writer, string sHeader, string sType, string sContent)
{
    Document doc = new Document();
    doc.Add(new Field("header", sHeader, Field.Store.YES, Field.Index.TOKENIZED));
    doc.Add(new Field("type", sType, Field.Store.YES, Field.Index.TOKENIZED));
    doc.Add(new Field("content", sContent, Field.Store.YES, Field.Index.TOKENIZED));
    writer.AddDocument(doc);
}
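1) doc.Add(new Field("header", sHeader, Field.Store.YES, Field.Index.TOKENIZED)); What does this line mean? What is Field.Index.TOKENIZED, and what is the difference between TOKENIZED and UN_TOKENIZED? When I search for a keyword that I passed as the type parameter, nothing is returned. I just don't understand this behavior.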
Here is the search sample, in which I search for a keyword that was indexed:
ListBox1.Items.Clear();
var searcher = new Lucene.Net.Search.IndexSearcher(MapPath("~/searchlucene/"));
var oParser = new Lucene.Net.QueryParsers.QueryParser("content", new StandardAnalyzer());
string sHeader = " OR (header:" + TextBox1.Text + ")";
string sType = " OR (type:" + TextBox1.Text + ")";
string sSearchQuery = "(" + TextBox1.Text + sHeader + sType + ")";
var oHitColl = searcher.Search(oParser.Parse(sSearchQuery));
for (int i = 0; i < oHitColl.Length(); i++)
{
    Document oDoc = oHitColl.Doc(i);
    ListBox1.Items.Add(new ListItem(oDoc.Get("header") + oDoc.Get("type") + oDoc.Get("content")));
}
searcher.Close();
Could someone please help me understand this and clear up my confusion? Thanks.
Answer 0 (score: 0)
I just tested your code, and it works with Lucene 2.9.4.
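Field.Index.TOKENIZED means the analyzer breaks the text into tokens, which is what makes the field full-text searchable. You can use UN_TOKENIZED for fields you don't want analyzed, such as a product ID; such a field is indexed as a single term, so it matches only on its exact value.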
Note: you should use Field.Index.ANALYZED and Field.Index.NOT_ANALYZED as replacements for the deprecated TOKENIZED / UN_TOKENIZED values.
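As a rough sketch of the difference, the snippet below indexes the same value twice, once analyzed and once not, and then counts the documents that contain the exact term. It assumes the Lucene.Net 2.9.4 API; the in-memory RAMDirectory, the class name and the typeExact field are made up for illustration and are not part of your code.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

class AnalyzedVsNotAnalyzed
{
    static void Main()
    {
        // In-memory index (assumption: used here only so the sketch leaves no files behind).
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(
            dir,
            new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29),
            true, // create a new index
            IndexWriter.MaxFieldLength.UNLIMITED);

        Document doc = new Document();
        // Analyzed: StandardAnalyzer tokenizes and lowercases the value before indexing.
        doc.Add(new Field("type", "BBA-321type", Field.Store.YES, Field.Index.ANALYZED));
        // Not analyzed: the whole value is indexed verbatim as one term.
        // "typeExact" is a hypothetical field name.
        doc.Add(new Field("typeExact", "BBA-321type", Field.Store.YES, Field.Index.NOT_ANALYZED));
        writer.AddDocument(doc);
        writer.Close();

        IndexReader reader = IndexReader.Open(dir, true); // read-only
        // DocFreq looks the term up exactly as written, with no analysis applied.
        int analyzedHits = reader.DocFreq(new Term("type", "BBA-321type"));
        int exactHits = reader.DocFreq(new Term("typeExact", "BBA-321type"));
        Console.WriteLine("type: " + analyzedHits + ", typeExact: " + exactHits);
        reader.Close();
    }
}
```

Because StandardAnalyzer lowercases its tokens, the mixed-case term should not be found in the analyzed field, while the not-analyzed field should contain it exactly once. A QueryParser built with the same analyzer lowercases the query text as well, which is why your original search still works against the analyzed fields.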
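To see the difference between analyzed and not analyzed, you can try both and inspect the resulting index with Luke, which should give you a better idea of how it works.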