Lucene index contains wrong data

Date: 2014-08-19 20:02:02

Tags: c# search lucene full-text-search

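I am using Lucene.NET to build a search index. When I perform a search, most of the data is correct, except for a few entries. I tried to trace the problem back to its source but found no leads, so I would like to know whether all of my code is correct.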

public static readonly string[] FIELDNAMES = { "Name", "ExtID", "Description", "FieldType", "ContentType", "ID" };


    public static List<SearchResults> Search(string[] propertyNames, string propertyValue)
    {
        using (var dir = FSDirectory.Open(new DirectoryInfo(IndexLocation)))
        {
            using (IndexReader ir = IndexReader.Open(dir, true))
            {
                Searcher searcher = new IndexSearcher(ir);

                SearchAnalyzer analyzer = new SearchAnalyzer();
                var queryParser = new MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_30, propertyNames, analyzer);


                var query = queryParser.Parse(propertyValue + '*');

                TopDocs resultDocs = searcher.Search(query, ir.MaxDoc);

                var topDocs = resultDocs.ScoreDocs;

                List<SearchResults> AllSearchResults = new List<SearchResults>();

                foreach (var hit in topDocs)
                {
                    var DFS = searcher.Doc(hit.Doc);
                    var Name = DFS.Get(FIELDNAMES[0]);
                    var ExtID = DFS.Get(FIELDNAMES[1]);
                    int Res = Int32.Parse(DFS.Get(FIELDNAMES[3]));
                    int ID = Int32.Parse(DFS.Get(FIELDNAMES[5]));

                    AllSearchResults.Add(new SearchResults(ID, Name, ExtID, (ResultsType)Res));
                }

                return AllSearchResults;
            }
        }
    }

    /// <summary>
    /// Rebuilds Lucene Indexes
    /// </summary>
    /// <exception> Throws Exceptions if directory is in-use or cannot rebuild index </exception>
    /// <param name="Data">Data Base Connections</param>
    public static void RebuildLuceneIndex(DBData Data)
    {
        //var dir = new DirectoryInfo(IndexLocation);
        //dir.Delete();

        System.IO.Directory.Delete(IndexLocation, true);
        System.IO.Directory.CreateDirectory(IndexLocation);

        CreateIndex(Data);

    }


    public static void CreateIndex(DBData Data)
    {
        var Model = Data.MK3Model;

        using (var dir = FSDirectory.Open(new DirectoryInfo(IndexLocation)))
        {
            SearchAnalyzer analyzer = new SearchAnalyzer();
            using (var writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.LIMITED))
            {
                var SeriesIds = GetSeriesExternalIDs(Model);
                TitlesToDocument(Model, writer, SeriesIds);
                ProducerToDocument(Model, writer);
                CollectionToDocument(Data.FMGModel, writer);
                writer.Optimize();

            }
        }
    }

    private static void TitlesToDocument(MK3Entities Model, IndexWriter writer, List<string> Series)
    {
        try
        {
            foreach (var t in Model.Titles.Where(t => Series.Contains(t.ExtTitleID)).ToList())
            {
                GenerateTitleDocument(writer, t, true);
            }
            foreach (var t in Model.Titles.Where(t => !Series.Contains(t.ExtTitleID)).ToList())
            {
                GenerateTitleDocument(writer, t);
            }
        }
        catch (Exception ex)
        {
            MMSLogger.Instance.WriteToLog("Error Creating Titles Lucene index ex: " + ex.Message);
        }
    }

The data going into the index is fine, as you can see in this image (the variables are in the watch window): [image: Code stopped at debug point with variables in watch window]

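When I search for 'Jimm', my search returns this: [image: After Searching Jimm]. The name for Id 2026 is "Prentice Hall Understanding Music", not Jimmy Cater Library, and there is no title with that specific name either. Any ideas on how to fix this?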

1 answer:

Answer 0: (score: 1)

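I would inspect the index with Luke to verify that the data you put in is the data actually found in the index. With Luke you can at least assume the query logic is correct, so any mismatch you see there points at the indexing side rather than at your search code.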
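
If Luke is not at hand, roughly the same check can be done in code. The following is only a minimal sketch, assuming Lucene.Net 3.0.x (the version the question's code appears to target) and assuming the fields of interest were added as stored fields. `IndexDump` and `DumpStoredFields` are hypothetical names introduced here for illustration; they are not part of the question's code.

```csharp
using System;
using System.IO;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper (not from the question): prints the stored fields
// of every live document so they can be compared with the values seen
// in the debugger at indexing time.
public static class IndexDump
{
    public static void DumpStoredFields(string indexLocation, string[] fieldNames)
    {
        using (var dir = FSDirectory.Open(new DirectoryInfo(indexLocation)))
        using (var reader = IndexReader.Open(dir, true)) // true = read-only
        {
            for (int docId = 0; docId < reader.MaxDoc; docId++)
            {
                // Skip slots that belong to deleted documents.
                if (reader.IsDeleted(docId)) continue;

                var doc = reader.Document(docId);
                Console.Write("doc " + docId + ":");
                foreach (var name in fieldNames)
                {
                    // Get returns null when the field was not stored
                    // for this document.
                    Console.Write(" " + name + "=" + (doc.Get(name) ?? "<not stored>"));
                }
                Console.WriteLine();
            }
        }
    }
}
```

Calling `IndexDump.DumpStoredFields(IndexLocation, FIELDNAMES)` right after `CreateIndex` and looking for the line with `ID=2026` should show whether the unexpected name is already in the index or only appears at search time. Note that this only shows stored values; a field that was indexed but not stored will print as `<not stored>`.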