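I have been struggling with this for two days now: I simply cannot delete a document with indexWriter.deleteDocuments(term).
Below is the code I am testing with; I hope someone can point out what I am doing wrong. Things I have already tried: upgrading Lucene from 2.x to 5.x; using indexWriter.deleteDocuments() instead of indexReader.deleteDocuments(); setting the field's index options to NONE or to DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS.
Here is the code: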
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.*;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.io.IOException;
import java.nio.file.Paths;

public class TestSearch {
    static SimpleAnalyzer analyzer = new SimpleAnalyzer();

    public static void main(String[] argvs) throws IOException, ParseException {
        generateIndex("5836962b0293a47b09d345f1");
        query("5836962b0293a47b09d345f1");
        delete("5836962b0293a47b09d345f1");
        query("5836962b0293a47b09d345f1");
    }

    public static void generateIndex(String id) throws IOException {
        Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        IndexWriter iwriter = new IndexWriter(directory, config);
        FieldType fieldType = new FieldType();
        fieldType.setStored(true);
        fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
        Field idField = new Field("_id", id, fieldType);
        Document doc = new Document();
        doc.add(idField);
        iwriter.addDocument(doc);
        iwriter.close();
    }

    public static void query(String id) throws ParseException, IOException {
        Query query = new QueryParser("_id", analyzer).parse(id);
        Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
        IndexReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        ScoreDoc[] scoreDoc = isearcher.search(query, 100).scoreDocs;
        for (ScoreDoc scdoc : scoreDoc) {
            Document doc = isearcher.doc(scdoc.doc);
            System.out.println(doc.get("_id"));
        }
    }

    public static void delete(String id) {
        try {
            Directory directory = FSDirectory.open(Paths.get("/tmp/test/lucene"));
            IndexWriterConfig config = new IndexWriterConfig(analyzer);
            IndexWriter iwriter = new IndexWriter(directory, config);
            Term term = new Term("_id", id);
            iwriter.deleteDocuments(term);
            iwriter.commit();
            iwriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
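First, generateIndex() creates an index in /tmp/test/lucene, and query() shows that the id can be found. Then delete() is supposed to remove the document, but the second query() shows that the deletion failed.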
Here are the pom dependencies, in case someone wants to run the test:
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>5.5.4</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analyzers-common</artifactId>
<version>5.5.4</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queryparser</artifactId>
<version>5.5.4</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analyzers-smartcn</artifactId>
<version>5.5.4</version>
</dependency>
Desperately hoping for an answer.
Answer 0 (score: 1)
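Your problem is the analyzer. SimpleAnalyzer defines tokens as maximal strings of letters (StandardAnalyzer, or even WhitespaceAnalyzer, are more typical choices), so the value you are indexing gets split into the tokens "b", "a", "b", "d", "f". The delete method you have defined, however, does not go through the analyzer; it just creates a raw term from the full id, which matches none of those tokens. You can see this in action if you replace the body of main with the following: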
generateIndex("5836962b0293a47b09d345f1");
query("5836962b0293a47b09d345f1");
delete("b");
query("5836962b0293a47b09d345f1");
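To make the token split concrete without needing an index, here is a minimal plain-Java sketch that imitates letter-based tokenization (maximal runs of letters, lowercased). The class name LetterTokenSketch and the method letterTokens are hypothetical, and this is only an imitation of the behaviour described above, not Lucene's own code.

```java
import java.util.ArrayList;
import java.util.List;

public class LetterTokenSketch {
    // Imitation of letter-based tokenization: keep maximal runs of letters, lowercased.
    // Illustration only; this is not Lucene's actual implementation.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current run of letters.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String id = "5836962b0293a47b09d345f1";
        List<String> tokens = letterTokens(id);
        System.out.println(tokens);              // prints [b, a, b, d, f]
        // The full id is not among the tokens, so a raw term built from it matches nothing.
        System.out.println(tokens.contains(id)); // prints false
    }
}
```

Under this model, only terms such as "b" or "f" exist for the _id field, which is why deleting by "b" removes the document while deleting by the full id does not.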
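As a general rule, queries, terms, and the like do not analyze their input; QueryParser does.
For what looks like an identifier field, you probably do not want to analyze the field at all. In that case, add this to the FieldType: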
fieldType.setTokenized(false);
You will then have to change your query as well (again, QueryParser analyzes its input) and use a TermQuery instead:
Query query = new TermQuery(new Term("_id", id));
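The effect of leaving the field untokenized can be sketched with a toy inverted index in plain Java. Everything here (the class ExactTermDeleteSketch and its add and deleteByTerm methods) is hypothetical and only models exact-term matching; it is not how Lucene is implemented, and the tokenized branch handles ASCII letters only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class ExactTermDeleteSketch {
    // term -> ids of the documents containing that term (a toy inverted index)
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Set<Integer> liveDocs = new HashSet<>();
    private int nextDocId = 0;

    // Index a value either as a single term (untokenized)
    // or as lowercased runs of ASCII letters (tokenized).
    int add(String value, boolean tokenized) {
        int docId = nextDocId++;
        liveDocs.add(docId);
        List<String> terms = new ArrayList<>();
        if (tokenized) {
            for (String part : value.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                if (!part.isEmpty()) {
                    terms.add(part);
                }
            }
        } else {
            terms.add(value);
        }
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
        }
        return docId;
    }

    // Delete by exact term with no analysis applied; returns the number of documents removed.
    int deleteByTerm(String term) {
        int removed = 0;
        for (int docId : postings.getOrDefault(term, new HashSet<>())) {
            if (liveDocs.remove(docId)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        String id = "5836962b0293a47b09d345f1";

        ExactTermDeleteSketch tokenized = new ExactTermDeleteSketch();
        tokenized.add(id, true);
        System.out.println(tokenized.deleteByTerm(id));   // prints 0: the full id was never a term

        ExactTermDeleteSketch untokenized = new ExactTermDeleteSketch();
        untokenized.add(id, false);
        System.out.println(untokenized.deleteByTerm(id)); // prints 1: the whole id is the term
    }
}
```

The point of the sketch is that a raw term and the indexed value agree only when both skip analysis, which is what setTokenized(false) together with TermQuery achieves in the answer above.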