Whoosh - Slop operator behavior

Time: 2014-11-20 17:16:52

Tags: python python-2.7 full-text-search whoosh

# Text: income tax expense resulting from the utilization of net operating loss carry forwards

Query formats I tried:

q = QueryParser(u"content", ix.schema).parse(u"income utilization~3")
q = QueryParser(u"content", ix.schema).parse(u"'income utilization'~3")

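The slop operator doesn't seem to work for my use case. It ignores the slop value given in the formats above and always returns a result, even when the slop condition is not met. Can you help?

Output: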

 (content:income AND content:utilization)
 <Hit {'title': u'test'}>

Full snippet:

import os

from whoosh.fields import Schema, ID, TEXT
from whoosh.index import create_in, open_dir
from whoosh.qparser import QueryParser


schema = Schema(title=ID(stored=True), content=TEXT)

def setup():
    if not os.path.exists("indexdir"):
        os.makedirs("indexdir")

    ix = create_in("indexdir", schema)
    writer = ix.writer()
    writer.add_document(title=u"test", content=u"income tax expense resulting from the utilization of net operating loss carry forwards")
    writer.commit()

def fetch():
    ix = open_dir("indexdir")
    with ix.searcher() as searcher:
        q = QueryParser(u"content", ix.schema).parse(u"income utilization~3")
        print q
        results = searcher.search(q)
        for r in results:
            print r

if __name__ == '__main__':
    setup()
    fetch()

1 answer:

Answer 0 (score: 0)

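You are confusing the fuzzy operator with the slop operator:

  1. Fuzzy operator (edit distance), word~ or word~n: used for fuzzy terms, meaning it searches for terms within an edit distance of n from word.
  2. Slop operator, "word1 word2 ... wordk"~n: used for phrase search with a slop of n. The phrase must be in double quotes.
  3. You should try:

    q = QueryParser(u"content", ix.schema).parse(u'"income utilization"~3')

    References: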

    1. Adding fuzzy term queries
    2. whoosh.qparser.PhrasePlugin