PyLucene: exact match on some fields and fuzzy match on others

Time: 2019-07-08 06:23:12

Tags: python lucene pylucene

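I have a PyLucene instance in which I can query specific fields. For my application, I need an index with exact matching on two fields (e.g. event_id) and fuzzy matching on another field (e.g. description). All of my fields are text (string) fields.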

I could not find any resource that covers this, apart from a similar question here: Exact match on some fields and "fuzzy" search on others?, but that one does not involve PyLucene.

Right now, my code runs a simple search on a text field:

import lucene
from java.nio.file import Paths
from org.apache.lucene.analysis.standard import StandardAnalyzer
from org.apache.lucene.document import Document, Field, TextField
from org.apache.lucene.index import DirectoryReader, IndexWriter, IndexWriterConfig
from org.apache.lucene.queryparser.classic import QueryParser
from org.apache.lucene.search import IndexSearcher
from org.apache.lucene.store import SimpleFSDirectory

# Start the JVM before touching any Lucene class
lucene.initVM()

# Analyzer shared by indexing and query parsing
analyzer = StandardAnalyzer()

# Index directory
newDir = '/Users/john/checkIndex'
path = Paths.get(newDir)
directory = SimpleFSDirectory(path)
writerConfig = IndexWriterConfig(analyzer)
writer = IndexWriter(directory, writerConfig)

# Index every document in the corpus
for each in ["john", "doe", "my name is", "my", "name", "word", "python"]:
    print(each)
    document = Document()
    document.add(Field("description", each, TextField.TYPE_STORED))
    writer.addDocument(document)

# Flush the documents to disk and release the write lock
writer.commit()
writer.close()

# Open the index for searching
reader = DirectoryReader.open(directory)
searcher = IndexSearcher(reader)
queryParser = QueryParser("description", analyzer)

# Escape query-syntax characters, then fetch the top 3 hits
query = queryParser.parse(QueryParser.escape("name are"))
hits = searcher.search(query, 3)
for hit in hits.scoreDocs:
    docT = searcher.doc(hit.doc)
    print(docT.get("description"))

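Now for the fuzzy search. I understand how to add new fields, but I am not sure how to match them exactly while fuzzy-matching description at the same time.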

Any pointers?
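
One possible direction, offered as a rough sketch rather than a confirmed solution: index event_id with StringField so the whole value is stored as a single untokenized term, then combine a TermQuery on event_id with one FuzzyQuery per word of the description text inside a BooleanQuery. The helper names (fuzzy_tokens, make_document, build_query) are hypothetical, as is the choice of FILTER for the exact clause and the default of 2 edits. The Lucene classes are imported inside the functions so that the tokenizer helper can be used without a running JVM; lucene.initVM() is assumed to have been called before the other two functions are used.

```python
import re


def fuzzy_tokens(text):
    """Lowercase free text and split it into word tokens.

    FuzzyQuery terms are not passed through an analyzer, so they must
    already look like the indexed terms. This regex only approximates
    what StandardAnalyzer produces at index time.
    """
    return re.findall(r"\w+", text.lower())


def make_document(event_id, description):
    """Index-time side: event_id is one exact term, description is analyzed."""
    from org.apache.lucene.document import Document, Field, StringField, TextField

    doc = Document()
    doc.add(StringField("event_id", event_id, Field.Store.YES))
    doc.add(TextField("description", description, Field.Store.YES))
    return doc


def build_query(event_id, description_text, max_edits=2):
    """Query-time side: exact event_id plus fuzzy description words."""
    from org.apache.lucene.index import Term
    from org.apache.lucene.search import (
        BooleanClause, BooleanQuery, FuzzyQuery, TermQuery)

    combined = BooleanQuery.Builder()
    # Exact match: FILTER requires the term without affecting the score.
    combined.add(TermQuery(Term("event_id", event_id)),
                 BooleanClause.Occur.FILTER)

    tokens = fuzzy_tokens(description_text)
    if tokens:
        # At least one word must match within max_edits (Lucene allows 0 to 2).
        fuzzy = BooleanQuery.Builder()
        for token in tokens:
            fuzzy.add(FuzzyQuery(Term("description", token), max_edits),
                      BooleanClause.Occur.SHOULD)
        combined.add(fuzzy.build(), BooleanClause.Occur.MUST)
    return combined.build()
```

With documents written through make_document, the result of build_query("42", "nmae") could be passed to searcher.search in place of the parsed query above. A second exact field would presumably be one more TermQuery clause added the same way.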

0 answers