Lucene.NET search results

Time: 2012-01-25 15:03:15

Tags: c# c#-4.0 lucene lucene.net

I am using this code to build the index:

public void IndexEmployees(IEnumerable<Employee> employees)
{
    var indexPath = GetIndexPath();
    var directory = FSDirectory.Open(indexPath);

    var indexWriter = new IndexWriter(directory, new StandardAnalyzer(Version.LUCENE_29), true, IndexWriter.MaxFieldLength.UNLIMITED);

    foreach (var employee in employees)
    {
        var document = new Document();
        document.Add(new Field("EmployeeId", employee.EmployeeId.ToString(), Field.Store.YES, Field.Index.NO, Field.TermVector.NO));
        document.Add(new Field("Name", employee.FirstName + " " + employee.LastName, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));
        document.Add(new Field("OfficeName", employee.OfficeName, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));
        document.Add(new Field("CompetenceRatings", string.Join(" ", employee.CompetenceRatings.Select(cr => cr.Name)), Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));

        indexWriter.AddDocument(document);
    }

    indexWriter.Optimize();
    indexWriter.Close();

    var indexReader = IndexReader.Open(directory, true);
    var spell = new SpellChecker.Net.Search.Spell.SpellChecker(directory);
    spell.ClearIndex();

    spell.IndexDictionary(new LuceneDictionary(indexReader, "Name"));
    spell.IndexDictionary(new LuceneDictionary(indexReader, "OfficeName"));
    spell.IndexDictionary(new LuceneDictionary(indexReader, "CompetenceRatings"));
}

public DirectoryInfo GetIndexPath()
{
    return new DirectoryInfo(HttpContext.Current.Server.MapPath("/App_Data/EmployeeIndex/"));
}

And this code to find results (and suggestions):

public SearchResult Search(DirectoryInfo indexPath, string[] searchFields, string searchQuery)
{
    var directory = FSDirectory.Open(indexPath);

    var standardAnalyzer = new StandardAnalyzer(Version.LUCENE_29);

    var indexReader = IndexReader.Open(directory, true);
    var indexSearcher = new IndexSearcher(indexReader);

    var parser = new MultiFieldQueryParser(Version.LUCENE_29, searchFields, standardAnalyzer);
    //parser.SetDefaultOperator(QueryParser.Operator.OR);
    var query = parser.Parse(searchQuery);

    var hits = indexSearcher.Search(query, null, 5000);

    return new SearchResult
                {
                    Suggestions = FindSuggestions(indexPath, searchQuery),
                    LuceneDocuments = hits
                        .scoreDocs
                        .Select(scoreDoc => indexSearcher.Doc(scoreDoc.doc))
                        .ToArray()
                };
}

public string[] FindSuggestions(DirectoryInfo indexPath, string searchQuery)
{
    var directory = FSDirectory.Open(indexPath);

    var spell = new SpellChecker.Net.Search.Spell.SpellChecker(directory);

    var similarWords = spell.SuggestSimilar(searchQuery, 20);

    return similarWords;
}

var searchResult = Search(GetIndexPath(), new[] { "Name", "OfficeName", "CompetenceRatings" }, "admin*");

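Simple queries such as admin or admin* don't give me any results, even though I know there is an employee with that name. I also want to be able to find James Jameson if I search for James.

Thanks!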

3 answers:

Answer 0 (score: 4)

First things first: you have to commit your changes to the index.

indexWriter.Optimize();
indexWriter.Commit(); //Add This
indexWriter.Close();

Edit #2: Also, keep it simple until you have something that works.

Comment out this part for now:

//var indexReader = IndexReader.Open(directory, true);
//var spell = new SpellChecker.Net.Search.Spell.SpellChecker(directory);
//spell.ClearIndex();

//spell.IndexDictionary(new LuceneDictionary(indexReader, "Name"));
//spell.IndexDictionary(new LuceneDictionary(indexReader, "OfficeName"));
//spell.IndexDictionary(new LuceneDictionary(indexReader, "CompetenceRatings"));

Edit #3

The fields you are searching probably won't change often, so I would define them inside your search function:

string[] fields = new string[] { "Name", "OfficeName", "CompetenceRatings" };

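The main reason I suggest this is that field names are case-sensitive. Sometimes you get no results simply because you searched the "name" field (which doesn't exist) instead of the "Name" field. Keeping the field list in one place makes that kind of mistake easier to spot.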
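
To illustrate the case-sensitivity point, here is a minimal sketch that keeps the field names in one place. The `EmployeeFields` and `FieldNameDemo` names are made up for this example, and a plain `Dictionary` stands in for a Lucene document, so treat it as an illustration of the idea rather than Lucene.NET API usage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder for the field names used at both index and search time.
public static class EmployeeFields
{
    public const string Name = "Name";
    public const string OfficeName = "OfficeName";
    public const string CompetenceRatings = "CompetenceRatings";

    // The fields a search should cover, defined once.
    public static readonly string[] Searchable = { Name, OfficeName, CompetenceRatings };
}

public static class FieldNameDemo
{
    // Stand-in for a field lookup: the default string comparer is
    // case-sensitive, so "name" and "Name" are different keys.
    public static bool HasField(IDictionary<string, string> document, string fieldName)
    {
        return document.ContainsKey(fieldName);
    }
}
```

With a stand-in document that only contains `EmployeeFields.Name`, looking up "Name" succeeds while "name" finds nothing, which is the same kind of silent miss you would see as an empty result set.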

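Answer 1 (score: 1)

In my experience working with Lucene, I've found that you have to build your own queries to get "Google-like" behavior. Here is what I do. YMMV, but it produces the expected results in my application. The basic idea is to combine a term query (exact match), a prefix query (anything that starts with the term), and a fuzzy query for each term in the search string. The code below won't compile, but it should give you the idea: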

Query GetQuery(string querystring)
{
    Search.Search.BooleanQuery query = new Search.Search.BooleanQuery();

    Search.Analysis.TokenStream tk = StandardAnalyzerInstance.TokenStream(null, new StringReader(querystring));
    Search.Analysis.Tokenattributes.TermAttribute ta = tk.GetAttribute(typeof(Search.Analysis.Tokenattributes.TermAttribute)) as Search.Analysis.Tokenattributes.TermAttribute;

    while (tk.IncrementToken())
    {
        string term = ta.Term();
        Search.Search.BooleanQuery bq = new Search.Search.BooleanQuery();
        bq.Add(new Search.Search.TermQuery(new Search.Index.Term("fieldToQuery", term)), Search.Search.BooleanClause.Occur.SHOULD);
        bq.Add(new Search.Search.PrefixQuery(new Search.Index.Term("fieldToQuery", term)), Search.Search.BooleanClause.Occur.SHOULD);
        bq.Add(new Search.Search.FuzzyQuery(new Search.Index.Term("fieldToQuery", term)), Search.Search.BooleanClause.Occur.SHOULD);
        query.Add(bq, Search.Search.BooleanClause.Occur.MUST);
    }

    return query;
}
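
For comparison, roughly the same combination can be written as a query string in the classic query parser syntax, where `+` marks a required clause, `term*` is a prefix match and `term~` is a fuzzy match. The sketch below only builds that string. `QueryStringSketch.Build` is a hypothetical helper, it splits on whitespace and lowercases instead of running a real analyzer, and it does not escape special characters, so it is a rough illustration rather than a drop-in replacement for the code above.

```csharp
using System;
using System.Linq;

public static class QueryStringSketch
{
    // Hypothetical helper: for each whitespace-separated term, emit a required
    // group that matches the exact term, the term as a prefix, or a fuzzy variant.
    public static string Build(string field, string input)
    {
        var terms = input
            .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.ToLowerInvariant()); // crude stand-in for the analyzer

        var clauses = terms.Select(t =>
            string.Format("+({0}:{1} {0}:{1}* {0}:{1}~)", field, t));

        return string.Join(" ", clauses.ToArray());
    }
}
```

For example, `QueryStringSketch.Build("Name", "James")` yields `+(Name:james Name:james* Name:james~)`, which could then be handed to a query parser.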

Answer 2 (score: 0)

The Parse() method you are calling is the inherited one. Have you tried the static method that returns a Query object?

Parse(Version matchVersion, String[] queries, String[] fields, Analyzer analyzer)