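This is a program that suggests a player's name to the user when the user has made a typo. It is extremely slow.
First it has to issue a GET request, then it checks whether the player's name is in the JSON data; if it is, it passes. Otherwise, it takes every player's first and last name and appends them to the names list. It then uses get_close_matches to check whether first_name and last_name closely resemble any of the names in the list. I knew from the start that this would be very slow, but there has to be a faster way to do it; I just can't come up with one. Any suggestions?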
Answer 0 (score: 1)
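Well, since I already worked out my suggestion in the comments, I might as well post it as an answer, along with a few other ideas.
First, take your I/O operations out of the function so that you don't waste time making a request every time the function runs. Instead, fetch the JSON once when the script starts and keep it in local memory. If possible, downloading the JSON data beforehand and opening it as a local text file might be an even faster option.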
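Second, each loop iteration should get a unique set of candidates, since there is no need to compare them more than once. Once get_close_matches() has discarded a name, we know that the same name does not need to be compared again. (It would be a different story if the criterion for discarding a name depended on the names that follow, but I doubt that is the case here.)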
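Third, try using batches. Given that get_close_matches() is reasonably efficient, comparing against 10 candidates shouldn't be much slower than comparing against 1. But cutting the for loop from over 1 million iterations to over 100K iterations is a pretty significant boost.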
Fourth, I assume you are checking last_name == ['LastName'] and first_name == ['FirstName'] because in that case there would be no typo. So why not simply break out of the function at that point?
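To illustrate the second and fourth points, here is a minimal sketch of deduplicating the candidates and checking for an exact match before any fuzzy matching happens. The helper names build_name_index and has_exact_match, and the sample entries, are made up for illustration; the only thing taken from the question is the 'FirstName' / 'LastName' key layout.

```python
def build_name_index(players):
    """Return (unique_names, name_set) built from a list of player dicts.

    unique_names keeps the original order with duplicates removed;
    name_set allows constant-time exact-match checks.
    """
    seen = set()
    unique_names = []
    for player in players:
        name = player["FirstName"] + " " + player["LastName"]
        if name not in seen:
            seen.add(name)
            unique_names.append(name)
    return unique_names, seen


def has_exact_match(first_name, last_name, name_set):
    """True if the name is spelled exactly as in the data (no typo)."""
    return (first_name + " " + last_name) in name_set


# Made-up sample entries, for illustration only.
sample_players = [
    {"FirstName": "Matthew", "LastName": "Stafford"},
    {"FirstName": "Aaron", "LastName": "Rodgers"},
    {"FirstName": "Matthew", "LastName": "Stafford"},  # duplicate entry
]

unique_names, name_set = build_name_index(sample_players)
print(unique_names)
print(has_exact_match("Aaron", "Rodgers", name_set))
```

If has_exact_match returns True, the function can return right away and skip get_close_matches() entirely. Whether this is faster than checking inside each batch is something you would need to profile.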
Putting it all together, I would write something like this:
from difflib import get_close_matches

# I/O operation ONCE when the script is run.
# get_request is assumed to be your own helper that returns the parsed JSON.
my_request = get_request("https://www.mysportsfeeds.com/api/feed/pull/nfl/2016-2017-regular/active_players.json")

# Creating batches of 10 names; this also happens only once.
# As a result, the script might take longer to load but run faster.
# I'm sure there is a better way to create batches, but I don't know any.
batch = []  # This will contain up to 10 names.
names = []  # This will contain the batches.
for player in my_request['activeplayers']['playerentry']:
    name = player['FirstName'] + " " + player['LastName']
    batch.append(name)
    if len(batch) == 10:
        names.append(batch)
        batch = []
# If the number of names is not a multiple of 10, keep the leftover names too.
if batch:
    names.append(batch)

def suggest(first_name, last_name, names):
    desired_name = first_name + " " + last_name
    suggestions = []
    for batch in names:
        # Just return the name if there is no typo.
        # Alternatively, you can create a flat list of names outside of the function
        # and see if the desired_name is in the list of names to immediately
        # terminate the function. But I'm not sure which method is faster. It's
        # a quick profiling task for you, though.
        if desired_name in batch:
            return desired_name
        # This way, we only match with new candidates, 10 at a time.
        # extend() keeps the list of suggestions flat, so no flattening is needed.
        suggestions.extend(get_close_matches(desired_name, batch))
    return "did you mean " + ", ".join(suggestions) + "?"

print(suggest("mattthews ", "stafffford", names))  # should suggest Matthew Stafford
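As a variation on the first and third points, the startup work could be sketched as below: read a previously downloaded copy of the feed from disk and build the batches with slicing, which handles a final batch shorter than 10. The function names and the file name active_players.json are hypothetical, and the JSON layout is assumed to be the same one used in the code above.

```python
import json


def parse_players(raw_json):
    """Turn the feed's JSON text into a flat list of "First Last" names."""
    data = json.loads(raw_json)
    entries = data["activeplayers"]["playerentry"]
    return [p["FirstName"] + " " + p["LastName"] for p in entries]


def load_players(path):
    """Read a previously downloaded copy of the feed from disk."""
    with open(path, encoding="utf-8") as f:
        return parse_players(f.read())


def make_batches(items, size=10):
    """Split items into consecutive batches; the last one may be shorter."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Run once at startup, for example (hypothetical file name):
# names = make_batches(load_players("active_players.json"))
```

The resulting names list has the same shape as the one built by the loop above, so it can be passed to suggest() unchanged.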