Using a dictionary to correct spelling

Time: 2016-11-15 05:50:04

Tags: python dictionary

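I need to write a function that takes a string argument and returns a string with the spelling corrected according to this dictionary.

For example:

"I ate teh whole thing lol"

should return:

'I ate the whole thing haha'

This is what I have so far, but I don't know what to do next: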

def respell(string):
    respellings = {
        "teh":"the",
        "relevent":"relevant",
        "lite": "light",
        "lol":"haha" }
    respellingslist = reslepllings.split()

5 answers:

Answer 0 (score: 0)

Try this :)

def respelling(string):
    respellings = {
        "teh": "the",
        "relevent": "relevant",
        "lite": "light",
        "lol": "haha" }

    res = string.split()
    for i in range(len(res)):  # range works on Python 2 and 3; xrange is Python 2 only
        if res[i] in respellings:
            res[i] = respellings[res[i]]
    return ' '.join(res)

[Edit] One-liner:

return ' '.join(map(lambda s: respellings.get(s, s), string.split()))
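For reference, the one-liner can be wrapped into a complete function. This is only a minimal sketch: the names `RESPELLINGS` and `respell_words` are illustrative and do not come from the original answer, and a generator expression with `dict.get` stands in for `map` and `lambda`.

```python
# Illustrative names: RESPELLINGS and respell_words are not part of the original answer.
RESPELLINGS = {
    "teh": "the",
    "relevent": "relevant",
    "lite": "light",
    "lol": "haha",
}

def respell_words(text):
    # dict.get(word, word) returns the correction if one exists,
    # otherwise the word itself, so unknown words pass through unchanged.
    return " ".join(RESPELLINGS.get(word, word) for word in text.split())

print(respell_words("I ate teh whole thing lol"))  # I ate the whole thing haha
```

Because the text is split on whitespace, a word with attached punctuation such as "lol!" is looked up as "lol!" and is therefore left unchanged.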

Answer 1 (score: -1)

Assuming respellings contains many possibly misspelled words as keys:

def respell(s):
    respellings = {
        "teh":"the",
        "relevent":"relevant",
        "lite": "light",
        "lol":"haha" }
    if s in respellings:
        return respellings[s]

If the dictionary does not contain the given key, the function does nothing (it implicitly returns None).

Answer 2 (score: -1)

The following function takes a string and corrects every misspelling listed in the respellings dictionary that is passed as an argument:

def respell(s, respellings):
    for wrong in respellings:
        try:
            index = s.index(wrong)
            s = s[:index] + respellings[wrong] + s[len(wrong)+index:]
        except ValueError:  # str.index raises ValueError when the substring is absent
            print(wrong + " not in string")
    return s


>>> print(respell("I aet teh wohle tingh",
...               {"aet":"ate", "teh":"the", "wohle":"whole", "tingh":"thing"}))
I ate the whole thing

Answer 3 (score: -1)

Try:

def respell(s):
    respellings = {
        'teh': 'the',
        'relevent': 'relevant',
        'lite': 'light',
        'lol': 'haha'
    }
    for key in respellings:
        s = s.replace(key, respellings[key])
    return s
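Note that `str.replace` works on substrings, so a key such as `lite` would also turn "polite" into "polight". One possible way around this is to match whole words only with `re.sub` and a callback. The sketch below assumes that regex word boundaries (`\b`) are an acceptable definition of a word; the names `RESPELLINGS` and `respell_whole_words` are hypothetical.

```python
import re

# Illustrative names: RESPELLINGS and respell_whole_words are not from the original answer.
RESPELLINGS = {
    "teh": "the",
    "relevent": "relevant",
    "lite": "light",
    "lol": "haha",
}

# Builds one pattern of the form \b(teh|relevent|lite|lol)\b.
# re.escape guards against keys that contain regex metacharacters.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RESPELLINGS)) + r")\b")

def respell_whole_words(text):
    # The callback looks up the matched word; only whole words can match.
    return _PATTERN.sub(lambda m: RESPELLINGS[m.group(1)], text)

print(respell_whole_words("Be polite, teh lite is on lol!"))
# Be polite, the light is on haha!
```

Unlike splitting on whitespace, this also keeps the original spacing and handles words followed by punctuation.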

Answer 4 (score: -1)

If you only want to use a dictionary to correct spelling, and every misspelling is present as a key, your code should look like this:

def respell(word):
    respellings = {
        "teh": "the",
        "relevent": "relevant",
        "lite": "light",
        "lol": "haha" }
    try:
        return respellings[word]
    except KeyError:
        return word

string = "I ate teh whole thing lol"
correct_string = " ".join(respell(word) for word in string.split())

If you want proper spell checking (at a somewhat higher computational cost), take a look at the following.

from difflib import SequenceMatcher

def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

def respell(word):
    known_words = {"the", "man", "relevant", "light", "haha"}
    if word in known_words:
        return word, 1
    max_similarity = 0
    correct_word = None
    for known_word in known_words:
        similarity_value = similar(known_word, word)
        # Keep the best match seen so far.
        if max_similarity < similarity_value:
            max_similarity = similarity_value
            correct_word = known_word
    return correct_word, max_similarity

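The function above returns two values: a word and a similarity value between 0 and 1. If no known word is even remotely similar, the similarity is zero (0) and the word is None; if the given word is already a known (correct) word, the similarity is 1.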
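The standard library also provides `difflib.get_close_matches`, which does a comparable best-match lookup on top of `SequenceMatcher`. The following is a minimal sketch, assuming the same small known-word list and the default cutoff of 0.6; the names `KNOWN_WORDS` and `closest_known_word` are illustrative.

```python
from difflib import get_close_matches

# Illustrative names: KNOWN_WORDS and closest_known_word are not from the original answer.
KNOWN_WORDS = ["the", "man", "relevant", "light", "haha"]

def closest_known_word(word, cutoff=0.6):
    # get_close_matches returns up to n candidates whose similarity ratio
    # is at least `cutoff`, best match first; the list is empty if none qualify.
    matches = get_close_matches(word, KNOWN_WORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else word

print(closest_known_word("relevent"))  # relevant
print(closest_known_word("xyz"))       # xyz (no close match, returned unchanged)
```

A similarity-based approach can only recover words that look like a known word, so a substitution such as "lol" to "haha" would still need the explicit dictionary.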