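How do I replace words in a string in Python?

Time: 2013-04-04 21:24:20

Tags: python translation

I'm currently trying to write a basic "AI" that reflects statements back at the user. For example, if the user types "My cat is brown.", the AI replies "Your cat is brown?"

I've written a table of substitutions that I intend to use for the translation: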

reflections = \
[["I",          "you"],
 ["i",          "you"],
 ["We",         "you"],
 ["we",         "you"],
 ["We're",      "you're"],
 ["we're",      "you're"],
 ["I'm",        "you're"],
 ["i'm",        "you're"],
 ["im",         "you're"],
 ["this",       "that"],
 ["This",       "that"],
 ["am",         "are"],
 ["Am",         "are"],
 ["My",         "your"],
 ["my",         "your"],
 ["you",        "I"], # Grammar: Sometimes "me" is better
 ["You",        "I"],
 ["u",          "me"],
 ["I'd",        "you'd"],
 ["I'll",       "you'll"],
 ["We'd",       "you'd"],
 ["we'd",       "you'd"],
 ["We'll",      "you'll"],
 ["we'll",      "you'll"],
 ["You're",     "I'm"],
 ["you're",     "I'm"],
 ["ur",         "I'm"],
 ["c",          "see"],
 ["I've",       "you've"],
 ["We've",      "you've"],
 ["we've",      "you've"],
 ["Our",        "your"],
 ["our",        "your"],
 ["was",        "were"],
 ["Was",        "were"],
 ["were",       "was"],
 ["Were",       "was"],
 ["me",         "you"],
 ["your",       "my"],
 ["Your",       "my"]]

However, I'm having some trouble putting the data to use.

My current definition for reflecting a string is:

from string import maketrans

intab = ".!"
outtab = "??"
translate_message = maketrans(intab, outtab) #used to replace punctuation

def reflect_statement(message):
    if ' ' not in message:
        if len(message) == 0:
            return elicitations[0]
        if len(message) == 1:
            return elicitations[1]
        if len(message) == 2:
            return elicitations[2]
        if len(message) == 3:
            return elicitations[3]
        if len(message) == 4:
            return elicitations[4]
        if len(message) == 5:
            return elicitations[5]
        if len(message) == 6:
            return elicitations[6]
        if len(message) == 7:
            return elicitations[7]
        if len(message) == 8:
            return elicitations[8]
        if len(message) == 9:
            return elicitations[9]
        if len(message) == 10:
            return elicitations[10]
        if len(message) > 10:
            return elicitations[11]
    if ' ' in message:
        message = message.translate(translate_message)
        return message

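Ignore the references to elicitations; that is a separate part of the program that I have already finished.

I'd really appreciate any help anyone can give me.

Cheers!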
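For anyone running this on Python 3: `from string import maketrans` is a Python 2 import, and the Python 3 counterpart is the static method `str.maketrans`. A minimal sketch of the same punctuation step follows; the names `PUNCTUATION_TABLE` and `questionify` are made up for illustration and are not part of the code above.

```python
# Python 3 sketch of the punctuation table from the question.
# str.maketrans maps each character of the first string to the
# character at the same position in the second string.
PUNCTUATION_TABLE = str.maketrans(".!", "??")


def questionify(message):
    """Turn sentence-ending '.' and '!' into '?' (hypothetical helper)."""
    return message.translate(PUNCTUATION_TABLE)
```

Called as `questionify(message)`, it plays the role of `message.translate(translate_message)` in the function above.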

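2 Answers:

Answer 0: (score: 1)

What you really want to do is define a grammar, so that you can pick out the pronouns and replace only those...

Basically, for something like this you need to create a grammar anyway... otherwise your AI will never be able to work anything out.

Take a look at Neds List Of PyParsers... I tend to like ply.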

Answer 1: (score: 1)

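def reflect_statement(message):
    if ' ' not in message:
        # Single-word input: pick the elicitation by message length;
        # index 11 covers anything longer than 10 characters.
        if len(message) < 11:
            return elicitations[len(message)]
        else:
            return elicitations[11]
    else:
        # Apply each [old, new] pair from the reflections table in order.
        for pair in reflections:
            message = message.replace(*pair)
        return message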

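Notes:

  1. This will also replace parts of words. To replace whole words only, you'll need a carefully crafted regular expression. Something like r'\b{}\b'.format(oldword) might work, but I'm not sure: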

    import re
    for old, new in reflections:
        message = re.sub(r'\b{}\b'.format(old), new, message)
    
  2. A nested list isn't the most logical data structure for this; consider using a dictionary.
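One more pitfall with applying the pairs one after another: a later pair can undo an earlier one ("I" becomes "you", and then the "you" pair turns it back into "I"). Combining both notes, here is a rough sketch that uses a dictionary and a single regex pass, so each word is replaced at most once. The shortened `REFLECTIONS` mapping and the name `reflect_words` are assumptions for illustration; the full table from the question could presumably be converted with `dict(reflections)`.

```python
import re

# Shortened, illustrative mapping (not the full table from the question).
REFLECTIONS = {
    "I": "you",
    "i": "you",
    "I'm": "you're",
    "am": "are",
    "my": "your",
    "My": "your",
    "you": "I",
    "your": "my",
}

# One alternation of all keys, longest first so that "I'm" is tried
# before "I". re.escape guards against regex metacharacters in keys,
# and \b restricts matches to whole words.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(word) for word in sorted(REFLECTIONS, key=len, reverse=True))
    + r")\b"
)


def reflect_words(message):
    """Replace each whole-word key with its reflection in a single pass."""
    return _PATTERN.sub(lambda match: REFLECTIONS[match.group(0)], message)
```

Because the substitution happens in one pass over the original text, "you and I" comes out as "I and you" instead of being swapped twice, and words such as "mystery" are left alone.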