Deleting documents from Lucene by term

Time: 2010-09-14 16:36:04

Tags: lucene

The following code does not delete the document as expected:

        RAMDirectory idx     = new RAMDirectory();
        IndexWriter writer  = new IndexWriter(idx,
                                   new SnowballAnalyzer(Version.LUCENE_30, "English"),
                                   IndexWriter.MaxFieldLength.LIMITED);
        Document doc = new Document();
        doc.add(new Field("title", "mydoc", Field.Store.YES, Field.Index.NO));
        doc.add(new Field("content", "some content, deleteme", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        Document doc2 = new Document();        
        doc2.add(new Field("title", "mydoc2", Field.Store.YES, Field.Index.NO));
        doc2.add(new Field("content", "other content, don't deleteme", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc2);
        writer.optimize();
        writer.close();

        /*
        IndexReader reader = IndexReader.open(idx, false);
        int docs_up_for_deletion = reader.docFreq(new Term("title"));
        int before = reader.numDocs();
        int docs_deleted = reader.deleteDocuments(new Term("title", "mydoc"));
        reader.close();
        */

        IndexWriter writer2  = new IndexWriter(idx,
                                   new SnowballAnalyzer(Version.LUCENE_30, "English"),
                                   IndexWriter.MaxFieldLength.LIMITED);
        int before = writer2.numDocs();
        writer2.deleteDocuments(new Term("title", "mydoc"));
        writer2.commit();
        writer2.optimize();
        int after = writer2.numDocs();
        writer2.close();
        int docs_deleted = before - after;

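I have tried deleting with both IndexReader and IndexWriter, but neither works.

I also tried adding another IndexReader search after the code above, in case the count is only updated after writer2 is closed (as mentioned in this FAQ), but that did not help. Calling writer.deleteAll() works, but deleting by term does not.

I found an old reference saying that deletion by term only works on fields of type Field.Keyword, but that is no longer a valid field type in Lucene 3.x.

1 Answer:

Answer 0: (score: 1)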

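Your title field is not indexed, so the index contains no term for the delete to match. Change

new Field("title", "mydoc", Field.Store.YES, Field.Index.NO)

to

new Field("title", "mydoc", Field.Store.YES, Field.Index.ANALYZED)

or

new Field("title", "mydoc", Field.Store.YES, Field.Index.NOT_ANALYZED)

depending on whether you want the field to be analyzed.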
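
To illustrate why a stored-but-unindexed field cannot be the target of a delete by term, here is a minimal toy model written with only the Java standard library. It is not Lucene's API: `ToyIndex`, `ToyField`, and the `indexed` flag are hypothetical names invented for this sketch, and the whole field value is treated as a single term (roughly what NOT_ANALYZED would do). The point it tries to show is that a delete by term is a lookup in the term dictionary, and a field that was never indexed contributes nothing to that dictionary.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy inverted index for illustration only; this is NOT Lucene's API. */
final class ToyIndex {

    /** Hypothetical field: 'indexed' stands in for Field.Index.NO vs. an indexed option. */
    static final class ToyField {
        final String name;
        final String value;
        final boolean indexed;

        ToyField(String name, String value, boolean indexed) {
            this.name = name;
            this.value = value;
            this.indexed = indexed;
        }
    }

    // docId -> stored fields (what you get back when you load a document)
    private final Map<Integer, List<ToyField>> stored = new HashMap<>();
    // "field:value" -> ids of documents containing that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextId = 0;

    /** Stores every field, but adds a term to the postings only for indexed fields. */
    int addDocument(ToyField... fields) {
        int id = nextId++;
        stored.put(id, Arrays.asList(fields));
        for (ToyField f : fields) {
            if (f.indexed) {
                postings.computeIfAbsent(f.name + ":" + f.value, k -> new HashSet<>()).add(id);
            }
        }
        return id;
    }

    /** Deletes all documents containing the exact term; returns how many were removed. */
    int deleteDocuments(String field, String value) {
        Set<Integer> hits = postings.remove(field + ":" + value);
        if (hits == null) {
            return 0; // no such term in the dictionary, so nothing matches
        }
        int deleted = 0;
        for (int id : hits) {
            // Other postings may still mention this id; removing the stored
            // document is enough for this sketch.
            if (stored.remove(id) != null) {
                deleted++;
            }
        }
        return deleted;
    }

    int numDocs() {
        return stored.size();
    }

    public static void main(String[] args) {
        ToyIndex unindexed = new ToyIndex();
        unindexed.addDocument(new ToyField("title", "mydoc", false));
        unindexed.addDocument(new ToyField("title", "mydoc2", false));
        System.out.println(unindexed.deleteDocuments("title", "mydoc")); // prints 0

        ToyIndex indexed = new ToyIndex();
        indexed.addDocument(new ToyField("title", "mydoc", true));
        indexed.addDocument(new ToyField("title", "mydoc2", true));
        System.out.println(indexed.deleteDocuments("title", "mydoc")); // prints 1
    }
}
```

In the toy model the first delete returns 0 even though the title is stored and retrievable, which mirrors the behaviour in the question. In Lucene itself, also keep in mind that deleteDocuments(Term) matches the term exactly as it sits in the index and does not run it through the analyzer, so for an identifier-like field that you delete by, NOT_ANALYZED is usually the safer of the two options.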