Question

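I'm working on a project where I have to use speech-to-text input to determine who to call. Speech-to-text can give some unexpected results, so I want to do some dynamic matching on the strings. I'm starting small and trying to match a single name. My name is Nick, and I try to match my name against the speech-to-text output, but I also want it to match when the text is, for example, Nik or something similar. Ideally I would like something that matches whenever only 1 letter is wrong, so that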

nick ick nik nic nck

would all match my name. The simple code I currently have is:

  def user_to_call(s):
      redirect = None
      # each substring needs its own `in` test; `"NICK" or "NIK" in ...` is always true
      if "NICK" in s.upper() or "NIK" in s.upper():
          redirect = "Nick"
      return redirect

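For a 4-letter name it is feasible to put every possibility into the filter, but for a 12-letter name that gets out of hand, and I'm sure this can be done far more efficiently.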

Answer 1

You need to use the Levenshtein distance.

A Python implementation is available in nltk:

import nltk
nltk.edit_distance("humpty", "dumpty")
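
To tie this back to the "only 1 letter wrong" rule from the question, here is a minimal sketch. It assumes a hand-written Levenshtein function so it runs without nltk installed; `levenshtein`, `matches_name` and the `max_distance` parameter are hypothetical names for illustration, not part of nltk.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the letter
            ))
        previous = current
    return previous[-1]


def matches_name(word, name="nick", max_distance=1):
    """True if `word` is within `max_distance` edits of `name` (hypothetical helper)."""
    return levenshtein(word.lower(), name.lower()) <= max_distance


for word in ["Nick", "ick", "nik", "nic", "nck", "nickey"]:
    print(word, matches_name(word))
```

With a distance limit of 1, a single missing, extra or wrong letter still matches, while "nickey" (two extra letters) does not. The same comparison could be done with `nltk.edit_distance` in place of the hand-written function.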

Answer 2

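As I understand it, you are not looking for fuzzy matching in general (since you have not approved the other replies). If you only want to evaluate exactly what you specified in your request, the code is below. I added a few extra conditions that print an appropriate message. Feel free to remove them.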

def wordmatch(baseword, wordtoMatch, lengthOfMatch):
    lis_of_baseword = list(baseword.lower())
    lis_of_wordtoMatch = list(wordtoMatch.lower())
    match_count = 0  # renamed from `sum` to avoid shadowing the built-in
    for index_i, i in enumerate(lis_of_wordtoMatch):
        for index_j, j in enumerate(lis_of_baseword):
            if i in lis_of_baseword:
                # count a letter only if it sits at the same or a later position in baseword
                if i == j and index_i <= index_j:
                    match_count = match_count + 1
                    break
            else:
                print("word to match has characters which are not in baseword")
                return 0
    if match_count >= lengthOfMatch and len(wordtoMatch) <= len(baseword):
        return 1
    elif match_count >= lengthOfMatch and len(wordtoMatch) > len(baseword):
        print("word to match has no of characters more than that of baseword")
        return 0
    else:
        return 0

base = "Nick"
tomatch = ["Nick", "ick", "nik", "nic", "nck", "nickey","njick","nickk","nickn"]
wordlength_match = 3 # how many letters of the base word must match. In your case, it's 3

for t_word in tomatch:
    print(wordmatch(base,t_word,wordlength_match))

The output looks like this:

1
1
1
1
1
word to match has characters which are not in baseword
0
word to match has characters which are not in baseword
0
word to match has no of characters more than that of baseword
0
word to match has no of characters more than that of baseword
0

Let me know if this serves your purpose.

Answer 3

What you basically need is fuzzy string matching, see:

https://en.wikipedia.org/wiki/Approximate_string_matching

https://www.datacamp.com/community/tutorials/fuzzy-string-python

Based on that, you can check how similar the input is to the entries in your dictionary:

from fuzzywuzzy import fuzz

name = "nick"
tomatch = ["Nick", "ick", "nik", "nic", "nck", "nickey", "njick", "nickk", "nickn"]
for candidate in tomatch:  # avoid `str` as a variable name, it shadows the built-in
    ratio = fuzz.ratio(candidate.lower(), name.lower())
    print(ratio)

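This code prints a similarity ratio between 0 and 100 for each candidate.

You will have to try different ratio thresholds and check which one meets your requirement of only one letter being wrong.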
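
As a rough sketch of how such a threshold could drive the "who to call" decision, the example below uses `difflib` from the standard library as a stand-in for fuzzywuzzy, so it runs without extra packages. The contact names, the `cutoff` value of 0.85 and this version of `user_to_call` are assumptions for illustration only. With `difflib`, the one-letter variants of "nick" listed in the question score about 0.86, while "nickey" scores 0.8, which is why 0.85 is used here; longer names would probably need a different cutoff.

```python
import difflib


def user_to_call(spoken, contacts=("Nick", "Alexander", "Christopher"), cutoff=0.85):
    """Return the first contact that closely matches a word in `spoken`, else None.

    `contacts` and `cutoff` are hypothetical defaults; tune the cutoff per use case.
    """
    by_lower = {contact.lower(): contact for contact in contacts}
    for word in spoken.lower().split():
        # get_close_matches returns at most n candidates scoring >= cutoff, best first
        close = difflib.get_close_matches(word, list(by_lower), n=1, cutoff=cutoff)
        if close:
            return by_lower[close[0]]
    return None


print(user_to_call("please call nik"))
print(user_to_call("ring alexandr"))
print(user_to_call("hello world"))
```

Comparing word by word keeps unrelated words in the sentence from diluting the score, at the cost of not handling names that the speech engine splits into two words.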
