I have a document like this:
{
"_id" : ObjectId("586b723b4b9a835db416fa26"),
"name" : "test",
"countries" : {
"country" : [
{
"name" : "russia iraq"
},
{
"name" : "USA china"
}
]
}
}
It is stored in MongoDB, and I am trying to retrieve it with a phrase query (Lucene 6.2.0). My code is fairly simple. This is the indexing part:
StandardAnalyzer analyzer = new StandardAnalyzer();
// 1. create the index
Directory index = new RAMDirectory();
IndexWriterConfig config = new IndexWriterConfig(analyzer);
try {
IndexWriter w = new IndexWriter(index, config);
MongoClient client = new MongoClient("localhost", 27017);
DB database = client.getDB("test123");
DBCollection coll = database.getCollection("test1");
//MongoCollection<org.bson.Document> collection = database.getCollection("test1");
DBCursor cursor = coll.find();
System.out.println(cursor);
while (cursor.hasNext()) {
BasicDBObject obj = (BasicDBObject) cursor.next();
Document doc = new Document();
BasicDBObject f = (BasicDBObject) (obj.get("countries"));
List<BasicDBObject> dts = (List<BasicDBObject>)(f.get("country"));
doc.add(new TextField("id",obj.get("_id").toString().toLowerCase(), Field.Store.YES));
doc.add(new StringField("name",obj.get("name").toString(), Field.Store.YES));
doc.add(new StringField("countries",f.toString(), Field.Store.YES));
for(BasicDBObject d : dts){
doc.add(new StringField("country",d.get("name").toString(), Field.Store.YES));
//
}
w.addDocument(doc);
}
w.close();
and this is the search part:
PhraseQuery query = new PhraseQuery("country", "iraq russia" );
// 3. search
int hitsPerPage = 10;
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
TopDocs docs = searcher.search(query, hitsPerPage);
ScoreDoc[] hits = docs.scoreDocs;
// 4. display results
System.out.println("Found " + hits.length + " hits.");
for(int j=0;j<hits.length;++j) {
int docId = hits[j].doc;
Document d = searcher.doc(docId);
System.out.println(d);
}
reader.close();
}
catch (Exception e) {
e.printStackTrace();
}
I get zero hits for this query. Can anyone tell me what I am doing wrong?
Jars used: lucene-queries-4.2.0, lucene-queryparser-6.2.1, lucene-analyzers-common-6.2.0
Answer 0 (score: 0)
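First, never mix Lucene versions. All of your jars should be the same version. Upgrade lucene-queries to 6.2.1. In practice you may or may not run into problems mixing 6.2.0 and 6.2.1, but you should definitely upgrade lucene-analyzers-common as well.
PhraseQuery does not analyze its input for you; you have to add the terms individually. In your example, "iraq russia" is treated as a single term, not as two separate (analyzed) terms.
It should look like this: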
Query query = new PhraseQuery.Builder()
.add(new Term("country", "iraq"))
.add(new Term("country", "russia"))
.build();
If you want something that analyzes the input for you, you can use QueryParser:
QueryParser parser = new QueryParser("country", new StandardAnalyzer());
Query query = parser.parse("\"iraq russia\"");
Answer 1 (score: 0)
I made some changes, such as:
Query query = new PhraseQuery.Builder()
.add(new Term("country", "iraq"))
.add(new Term("country", "russia"))
.setSlop(2)
.build();
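The slop matters here because the query terms ("iraq", then "russia") are in the opposite order from the indexed text ("russia iraq"), and Lucene's PhraseQuery documentation notes that two words out of order need a slop of at least 2. The following is a simplified plain-Java model of that distance, not Lucene's actual scorer: the class `SlopModel` and the method `requiredSlop` are hypothetical names, and the model assumes each query term occurs exactly once in the document.

```java
import java.util.Arrays;
import java.util.List;

public class SlopModel {
    // Returns the smallest slop at which the phrase would match in this
    // simplified model, or -1 if a query term is missing from the document.
    // Assumption: every query term occurs exactly once in docTokens.
    static int requiredSlop(List<String> queryTerms, List<String> docTokens) {
        if (queryTerms.isEmpty()) {
            return 0;
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int q = 0; q < queryTerms.size(); q++) {
            int p = docTokens.indexOf(queryTerms.get(q));
            if (p < 0) {
                return -1; // term not present, no match at any slop
            }
            // Shift the document position by the term's position in the query.
            int shifted = p - q;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        // Exact phrase: all shifted positions are equal, so the spread is 0.
        return max - min;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("russia", "iraq");
        System.out.println(requiredSlop(Arrays.asList("russia", "iraq"), doc));
        System.out.println(requiredSlop(Arrays.asList("iraq", "russia"), doc));
    }
}
```

In this model the in-order phrase needs slop 0 and the reversed phrase needs slop 2, which is why `setSlop(2)` lets "iraq russia" match a field containing "russia iraq".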
I also changed the field type used at index time:
for(BasicDBObject d : dts){
doc.add(new TextField("country",d.get("name").toString(), Field.Store.YES));
}
But can someone tell me the difference between StringField and TextField at index time?
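For context on the difference asked about: per the Lucene documentation, a StringField is indexed but not tokenized (the entire value becomes a single term), while a TextField is indexed and tokenized by the analyzer. The sketch below is a rough plain-Java model of that behaviour and does not use Lucene. The names `FieldModel`, `indexAsStringField`, and `indexAsTextField` are hypothetical, and the lowercase-and-split step is only an approximation of StandardAnalyzer (it ignores stop words, for example).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class FieldModel {
    // StringField-like: the whole value is kept as one term, unchanged.
    static List<String> indexAsStringField(String value) {
        return Collections.singletonList(value);
    }

    // TextField-like: lowercase, then split on anything that is not a
    // letter or digit. This is an assumed stand-in for StandardAnalyzer.
    static List<String> indexAsTextField(String value) {
        List<String> tokens = new ArrayList<>();
        String lower = value.toLowerCase(Locale.ROOT);
        for (String t : lower.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // One term: "russia iraq". A query for the term "iraq" finds nothing.
        System.out.println(indexAsStringField("russia iraq"));
        // Two terms: "russia" and "iraq", each searchable on its own.
        System.out.println(indexAsTextField("russia iraq"));
    }
}
```

Under this model, the StringField version of "country" holds the single term "russia iraq", so a phrase built from the separate terms "iraq" and "russia" has nothing to match. Switching to TextField produces the individual terms that the phrase query looks for.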