我需要以下代码的帮助。在这种情况下,我需要找到与输入单词最接近的单词,以测试我将word_0设置为“ pikaru”,应返回“ pikachu”。 levenshtein函数返回我们输入的两个单词之间的距离。当我运行下面的代码时,我得到的答案是charmander,该方法还差得远,我们将不胜感激。
import backend
name_to_stats, id_to_name, names,
pokemon_by_typebackend.get_pokemon_stats()
words = names
word_0 = 'pikaru'
def find_closest_word(word_0, words):
"""Finds the closest word in the list to word_0 as measured by the
Levenshtein distance
Args:
word_0: a str
words: a list of str
Returns:
The closest word in words to word_0 as a str.
"""
# Hint: use the levenshtein_distance() function to help you out here.
closest_word = words[0]
#closest_distance = levenshtein_distance(word_0, words[0])
for i in words:
distance = levenshtein_distance(word_0, closest_word)
new_distance = levenshtein_distance(word_0, i)
if distance < new_distance:
return i
def levenshtein_distance(s1, s2):
"""Returns the Levenshtein distance between strs s1 and s2
Args:
s1: a str
s2: a str
"""
# This function has already been implemented for you.
# Source of the implementation:
# https://stackoverflow.com/questions/2460177/edit-distance-in-python
# If you'd like to know more about this algorithm, you can study it in
# CSCC73 Algorithms. It applies an advanced technique called dynamic
# programming.
# For more information:
# https://en.wikipedia.org/wiki/Levenshtein_distance
# https://en.wikipedia.org/wiki/Dynamic_programming
if len(s1) > len(s2):
s1, s2 = s2, s1
distances = range(len(s1) + 1)
for i2, c2 in enumerate(s2):
distances_ = [i2+1]
for i1, c1 in enumerate(s1):
if c1 == c2:
distances_.append(distances[i1])
else:
distances_.append(1 + min((distances[i1], distances[i1 + 1],
distances_[-1])))
distances = distances_
return distances[-1]
答案 0 :(得分:1)
错误似乎出在return
函数的find_closest_word
语句中:
if distance < new_distance:
return i
该函数不会找到最接近的单词,它实际上会找到列表中比words[0]
更远的第一个单词。相反,请尝试遍历words
并跟踪哪个单词是到目前为止您看到的最好的单词。像这样:
best_distance = levenshtein_distance(word_0, words[0])
best_word = words[0]
for w in words:
d = levenshtein_distance(word_0, w)
if d < best_distance:
best_distance = d
best_word = w
return best_word