Question

I have this function that converts text-speak into English:

def translate(string):
    textDict={'y':'why', 'r':'are', "l8":'late', 'u':'you', 'gtg':'got to go',
        'lol': 'laugh out    loud', 'ur': 'your',}
    translatestring = ''
    for word in string.split(' '):
        if word in textDict:
            translatestring = translatestring + textDict[word]
        else:
            translatestring = translatestring + word
    return translatestring

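However, if I try to translate y u l8?, it returns whyyoul8?. How would I separate the words when I return them, and how do I handle punctuation? Any help appreciated!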

Answer 1

A one-liner comprehension:

''.join(textDict.get(word, word) for word in re.findall(r'\w+|\W+', string))

[Edit] Fixed the regex.
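For reference, the one-liner can be wrapped into a self-contained function. This is a minimal sketch that assumes the dictionary from the question; the name translate_tokens is made up here for illustration.

```python
import re

# Dictionary taken from the question.
textDict = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
            'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate_tokens(string):
    # \w+ matches runs of word characters, \W+ matches everything else
    # (spaces, punctuation), so the tokens cover the whole input in order.
    tokens = re.findall(r'\w+|\W+', string)
    # Replace known abbreviations; leave every other token untouched.
    return ''.join(textDict.get(token, token) for token in tokens)

print(translate_tokens('y u l8?'))  # why you late?
```

Because the separators are tokens too, joining with an empty string reproduces the original spacing and punctuation.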

Answer 2

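You are appending the words to the string without any spaces. If you intend to do it this way (rather than the way suggested in your previous question on this topic), you need to re-add the spaces manually, because you split on them.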
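One way to put the spaces back is to join the translated words with ' '. This is only a sketch using the dictionary from the question; translate_with_spaces is a hypothetical name, and punctuation attached to a word is still not handled.

```python
# Dictionary taken from the question.
textDict = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
            'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate_with_spaces(string):
    # Split on single spaces, translate each word, then restore the
    # spaces that split() removed by joining with ' '.
    words = string.split(' ')
    return ' '.join(textDict.get(word, word) for word in words)

print(translate_with_spaces('y u l8'))   # why you late
print(translate_with_spaces('y u l8?'))  # why you l8?  ('l8?' is not a key)
```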

Answer 3

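"y u l8" split on " " gives ["y", "u", "l8"]. After substitution you get ["why", "you", "late"], and you concatenate them without adding spaces, so you get "whyyoulate". Both branches of the if should insert a space.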

Answer 4

You can simply add + ' ' + to insert a space. However, I think what you are trying to do is this:

import re

def translate_string(text):
    textDict = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you', 'gtg': 'got to go',
                'lol': 'laugh out loud', 'ur': 'your'}
    translatestring = ''
    # The capturing group makes re.split keep the separators (spaces, punctuation).
    for word in re.split(r'(\W+)', text):
        if word in textDict:
            translatestring += textDict[word]
        else:
            # Unknown words and separators pass through unchanged.
            translatestring += word

    return translatestring


print(translate_string('y u l8?'))

This prints:

why you late?

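This code handles things like question marks more gracefully, and it preserves the spaces and other characters of the input string while keeping your original intent.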
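Spaces and punctuation survive because the split pattern contains a capturing group, which makes re.split return the separators along with the words. The sketch below illustrates this with the pattern (\W+), which is an assumption here and not necessarily the exact pattern used in the answer.

```python
import re

# Without a capturing group the separators are discarded.
plain = re.split(r'\W+', 'y u l8?')
print(plain)   # ['y', 'u', 'l8', '']

# With a capturing group the separators are kept as list items,
# so concatenating the items rebuilds the original string.
tokens = re.split(r'(\W+)', 'y u l8?')
print(tokens)  # ['y', ' ', 'u', ' ', 'l8', '?', '']
```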

Answer 5

I'd like to suggest replacing this loop:

for word in string.split(' '):
    if word in textDict:
        translatestring = translatestring + textDict[word]
    else:
        translatestring = translatestring + word

with this:

for word in string.split(' '):
    translatestring += textDict.get(word, word)

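dict.get(foo, default) looks up foo in the dictionary and returns default if foo is not defined there. Passing the word itself as the default therefore leaves unknown words unchanged.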
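A short illustration of how dict.get behaves, using a cut-down dictionary chosen just for this example:

```python
# Cut-down dictionary for illustration only.
textDict = {'y': 'why', 'u': 'you'}

known = textDict.get('y', 'y')            # key present: its value is returned
unknown = textDict.get('hello', 'hello')  # key missing: the default is returned
no_default = textDict.get('hello')        # key missing, no default: None

print(known, unknown, no_default)  # why hello None
```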

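(I'm running out of time, so just a brief note for now: when splitting, you can split on punctuation as well as whitespace, save the punctuation or whitespace, and reintroduce it when joining the output string. It is a bit more work, but it will get the job done.)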
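A related route, which is not the explicit split-and-rejoin described above but gives the same result, is to let re.sub replace only the word runs so that the separators never have to be stored. This is a sketch under that assumption; translate_sub and expand are hypothetical names.

```python
import re

# Dictionary taken from the question.
textDict = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
            'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate_sub(string):
    # Called once per matched word; returns the expansion or the word itself.
    def expand(match):
        word = match.group(0)
        return textDict.get(word, word)
    # Only runs of word characters are matched and replaced; the spaces
    # and punctuation between them are copied through unchanged.
    return re.sub(r'\w+', expand, string)

print(translate_sub('y r u l8? gtg!'))  # why are you late? got to go!
```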

Small problem with whitespace/punctuation in Python?

5 answers: