Lucene.Net indexing documents with Parallel.ForEach - "Object reference" error

Time: 2018-10-14 22:52:40

Tags: c# concurrency lucene lucene.net parallel.foreach

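I am using Lucene.Net to index files on various drives. I read that IndexWriter is thread-safe, and figured I could speed up "reading" the documents and writing them to the index by using a Parallel.ForEach loop. However, I keep getting intermittent "Object reference not set to an instance of an object" errors, which makes me think that what I am doing is not thread-safe, even though I use a null check to make sure the object is not null before adding it to the writer. Here is the code; any pointers or ideas on what might be causing this error? I have marked it in the code below; it is on the CreateDocument() line.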

public class LuceneIndexService : LuceneService
{

    private IndexWriter writer;
    private IndexWriterConfig config;

    public LuceneIndexService(string _indexLocation) : base(_indexLocation)
    {
        config = new IndexWriterConfig(Lucene.Net.Util.LuceneVersion.LUCENE_48, analyzer);
        writer = new IndexWriter(luceneIndexDirectory, config);
    }

    public void CreateIndex(List<string> filesToIndex, BackgroundWorker bgWorker)
    {
        // Background worker is passed to report progress
        BackgroundWorker bgWorkerToReportTo = bgWorker;
        int fileCount = filesToIndex.Count;
        int progressCount = 0;

        // Writing files to index
        Parallel.ForEach(filesToIndex, (filepath) =>
        {
            FileData documentData = FileReader.GetDataFromFile(filepath);
            //null check
            if (documentData == null || documentData.content == null || documentData.filepath == null || documentData.filesize == null)
                Console.WriteLine(filepath + " IS NULL");
            else
            {
                // ** This is where I get the error **
                CreateDocument(documentData);
            }

            // Reporting progress so progress bar can be updated
            int percentage = (Interlocked.Increment(ref progressCount)) * 100 / fileCount;
            bgWorkerToReportTo.ReportProgress(percentage);
        });

        writer.Flush(true, true);
        writer.Dispose();
    }

    private void CreateDocument(FileData data)
    {
        Document document = new Document();
        document.AddTextField("content", data.content, Field.Store.YES);
        document.AddStringField("filepath", data.filepath, Field.Store.YES);
        document.AddStringField("filesize", data.filesize, Field.Store.YES);

        writer.AddDocument(document);
    }
}

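This is the abstract class that the class above inherits from. It just declares a common analyzer that multiple classes (writer, searcher, etc.) can use, and "opens" the directory.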

public abstract class LuceneService
{
    internal Analyzer analyzer = new StandardAnalyzer(Lucene.Net.Util.LuceneVersion.LUCENE_48);
    internal Directory luceneIndexDirectory;
    internal string IndexLocation;

    public LuceneService(string _indexLocation)
    {
        IndexLocation = _indexLocation;
        luceneIndexDirectory = FSDirectory.Open(_indexLocation);
    }
}
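
The error is reported on the CreateDocument() line, but the code above does not show which call inside it actually dereferences null. One way to narrow that down, sketched here under assumptions, is to catch exceptions inside the Parallel.ForEach body and record the item together with the full exception (including its stack trace) instead of letting the loop surface only an AggregateException. The names ParallelDiagnostics and RunAndCollect are hypothetical and not part of Lucene.Net or the code above, and the lambda in Main is only a stand-in for the GetDataFromFile + CreateDocument work.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of Lucene.Net or the question's code.
public static class ParallelDiagnostics
{
    // Runs `work` for every item in parallel and returns one (item, exception)
    // pair per failure, so a single bad item does not hide the others.
    public static List<KeyValuePair<string, Exception>> RunAndCollect(
        IEnumerable<string> items, Action<string> work)
    {
        // ConcurrentQueue is safe to write to from several threads at once.
        var failures = new ConcurrentQueue<KeyValuePair<string, Exception>>();

        Parallel.ForEach(items, item =>
        {
            try
            {
                work(item);
            }
            catch (Exception ex)
            {
                // Keep the whole exception: its stack trace names the method
                // in which the failure was raised.
                failures.Enqueue(new KeyValuePair<string, Exception>(item, ex));
            }
        });

        return new List<KeyValuePair<string, Exception>>(failures);
    }
}

public static class Program
{
    public static void Main()
    {
        var paths = new[] { "a.txt", "bad.txt", "c.txt" };

        // Stand-in for the real per-file work; "bad.txt" is made to
        // dereference null on purpose to simulate the reported error.
        var failures = ParallelDiagnostics.RunAndCollect(paths, path =>
        {
            string content = path.StartsWith("bad") ? null : "text";
            Console.WriteLine(path + ": " + content.Length);
        });

        foreach (var failure in failures)
            Console.WriteLine(failure.Key + " failed: " + failure.Value);
    }
}
```

In the question's code, the equivalent would be wrapping the body of the existing Parallel.ForEach lambda in the same try/catch. The recorded stack trace would then indicate whether the null comes from the document fields, from writer.AddDocument, or from somewhere else; this sketch does not assume which.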

0 answers