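I have a PyLucene instance in which I can query a particular field. For my application I need an index with exact matching on two fields (say event_id) and fuzzy matching on another (say description). All of my fields are text (string) fields.
I couldn't find any resources that do this, apart from a similar question here: Exact match on some fields and "fuzzy" search on others?, and that one doesn't cover PyLucene.
Right now, my code does a simple search on the text field: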
import lucene
from java.nio.file import Paths
from org.apache.lucene.analysis.standard import StandardAnalyzer
from org.apache.lucene.document import Document, Field, TextField
from org.apache.lucene.index import DirectoryReader, IndexWriter, IndexWriterConfig
from org.apache.lucene.queryparser.classic import QueryParser
from org.apache.lucene.search import IndexSearcher
from org.apache.lucene.store import SimpleFSDirectory

lucene.initVM()
# ANALYZER
analyzer = StandardAnalyzer()
# DIRECTORY
newDir = '/Users/john/checkIndex'
path = Paths.get(newDir)
directory = SimpleFSDirectory(path)
writerConfig = IndexWriterConfig(analyzer)
writer = IndexWriter(directory, writerConfig)
# print(writer.numDocs())
# INDEXING ALL DOCUMENTS/ARTICLES IN THE CORPUS
for each in ["john", "doe", "my name is", "my", "name", "word", "python"]:
    print(each)
    # break
    document = Document()
    # TextField is analyzed (tokenized) and stored
    document.add(Field("description", each, TextField.TYPE_STORED))
    writer.addDocument(document)
# print(writer.numDocs())
writer.commit()
# writer.close()
# SEARCHING
directory = SimpleFSDirectory(path)
reader = DirectoryReader.open(directory)
searcher = IndexSearcher(reader)
queryParser = QueryParser("description", analyzer)
query = queryParser.parse(queryParser.escape("name are"))
hits = searcher.search(query, 3)  # top 3 hits
docsScores = [hit.score for hit in hits.scoreDocs]
if docsScores != []:
    for hit in hits.scoreDocs:
        # print(hit)
        docT = searcher.doc(hit.doc)
        print(docT.get("description"))
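This currently does a fuzzy search. I understand how to add new fields, but I'm not sure how to do an exact match on them while simultaneously doing a fuzzy match on "description".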
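To make the intended behaviour concrete, here is a minimal plain-Python sketch (not PyLucene) of the semantics described above: a record qualifies only if event_id matches exactly and description matches approximately. The sample records, the `fuzzy_score` and `search` helpers, and the 0.6 threshold are all hypothetical, and `difflib.SequenceMatcher` is only a stand-in for whatever similarity measure Lucene would apply.

```python
from difflib import SequenceMatcher

# Hypothetical sample records; the event_id values are made up for illustration.
DOCS = [
    {"event_id": "E1", "description": "my name is john"},
    {"event_id": "E1", "description": "python word"},
    {"event_id": "E2", "description": "my name is john"},
]


def fuzzy_score(query, text):
    """Best similarity (0.0 to 1.0) between any query token and any text token."""
    q_tokens = query.lower().split()
    t_tokens = text.lower().split()
    return max(
        (SequenceMatcher(None, q, t).ratio() for q in q_tokens for t in t_tokens),
        default=0.0,
    )


def search(docs, event_id, description_query, threshold=0.6):
    """Return docs whose event_id matches exactly AND whose description matches fuzzily."""
    hits = []
    for doc in docs:
        # Exact clause: plain string equality, no tokenizing, no tolerance.
        if doc["event_id"] != event_id:
            continue
        # Fuzzy clause: tolerate small differences such as typos.
        score = fuzzy_score(description_query, doc["description"])
        if score >= threshold:
            hits.append((score, doc))
    # Best fuzzy matches first.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in hits]


# The misspelled query "nmae" should still reach "name", but only within event E1.
for doc in search(DOCS, "E1", "nmae"):
    print(doc["event_id"], doc["description"])
```

In Lucene terms this presumably corresponds to a BooleanQuery with one required exact clause on event_id and one required fuzzy (or parsed) clause on description; that mapping is an assumption, and the code above does not show how to express it in PyLucene.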
Any pointers?