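I am working on a project where I have to use speech-to-text input to determine who to call. Speech-to-text can produce unexpected results, so I want to do some kind of dynamic matching on the string. I am starting small by trying to match a single name. My name is Nick (尼克·韦斯), and I am trying to match my name against the speech-to-text output, but I also want it to match when the text is, for example, Nik or something similar. Ideally I would like something that matches anything with at most one letter wrong, so that
Nick ick nik nic nck
all match my name. The simple code I currently have is: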
def user_to_call(s):
    redirect = None
    # Test each spelling separately; `"NICK" or "NIK" in s.upper()` is always true.
    if "NICK" in s.upper() or "NIK" in s.upper():
        redirect = "Nick"
    return redirect
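For a 4-letter name it is feasible to put every possibility into the filter, but for a 12-letter name that gets out of hand, and I am sure there is a more efficient way to do this.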
Answer 0 (score: 1)
Answer 1 (score: 0)
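As I understand it, you are not looking for fuzzy matching (since you have not accepted the other answers). If you just want to evaluate what you specified in your question, the code is below. I added a few extra conditions that print appropriate messages; feel free to remove them.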
def wordmatch(baseword, wordtoMatch, lengthOfMatch):
    # Returns 1 when at least lengthOfMatch letters of wordtoMatch are found
    # in baseword and wordtoMatch is no longer than baseword; otherwise 0.
    lis_of_baseword = list(baseword.lower())
    lis_of_wordtoMatch = list(wordtoMatch.lower())
    matched = 0  # called `matched` rather than `sum` to avoid shadowing the built-in
    for index_i, i in enumerate(lis_of_wordtoMatch):
        for index_j, j in enumerate(lis_of_baseword):
            if i in lis_of_baseword:
                # Count the letter if it occurs at the same or a later position in baseword.
                if i == j and index_i <= index_j:
                    matched = matched + 1
                    break
            else:
                print("word to match has characters which are not in baseword")
                return 0
    if matched >= lengthOfMatch and len(wordtoMatch) <= len(baseword):
        return 1
    elif matched >= lengthOfMatch and len(wordtoMatch) > len(baseword):
        print("word to match has no of characters more than that of baseword")
        return 0
    else:
        return 0

base = "Nick"
tomatch = ["Nick", "ick", "nik", "nic", "nck", "nickey", "njick", "nickk", "nickn"]
wordlength_match = 3  # how many letters of the base word must match; in your case, 3
for t_word in tomatch:
    print(wordmatch(base, t_word, wordlength_match))
The output looks like this:
1
1
1
1
1
word to match has characters which are not in baseword
0
word to match has characters which are not in baseword
0
word to match has no of characters more than that of baseword
0
word to match has no of characters more than that of baseword
0
Let me know if this serves your purpose.
Answer 2 (score: 0)
What you basically need is fuzzy string matching; see:
https://en.wikipedia.org/wiki/Approximate_string_matching
https://www.datacamp.com/community/tutorials/fuzzy-string-python
Based on that, you can check how similar the input is to the entries in your dictionary:
from fuzzywuzzy import fuzz

name = "nick"
tomatch = ["Nick", "ick", "nik", "nic", "nck", "nickey", "njick", "nickk", "nickn"]
for candidate in tomatch:  # `candidate` instead of `str`, which would shadow the built-in
    ratio = fuzz.ratio(candidate.lower(), name.lower())
    print(ratio)
This code produces the following output:
100
86
86
86
86
80
89
89
89
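You will have to experiment with different ratio thresholds to find one that fits your requirement of at most one wrong or missing letter.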
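If tuning a ratio threshold proves awkward, one alternative is to count the errors directly with an edit (Levenshtein) distance and accept anything within one edit. The following is only a minimal sketch using the standard library; the names `edit_distance`, `matches_name`, and `max_errors` are hypothetical helpers introduced for illustration and are not part of fuzzywuzzy.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def matches_name(heard: str, name: str, max_errors: int = 1) -> bool:
    """Hypothetical helper: True if `heard` is within max_errors edits of `name`."""
    return edit_distance(heard.lower(), name.lower()) <= max_errors


candidates = ["Nick", "ick", "nik", "nic", "nck", "nickey", "njick", "nickk", "nickn"]
results = {word: matches_name(word, "Nick") for word in candidates}
for word, ok in results.items():
    print(word, ok)
```

With `max_errors=1`, every candidate in this list is accepted except "nickey", which is two insertions away from "nick". Note that this treats an extra letter ("njick", "nickk") as a single error, so tighten or extend the rule if that is not what you want.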