Question

I'll explain in detail what I'm trying to achieve. I have two dictionary-based translation programs. Here is the code for program 1:

import re
words = {'i':'jeg','am':'er','happy':'glad'}

text = "I am happy.".split()
translation = []

for word in text:
    word_mod = re.sub('[^a-z0-9]', '', word.lower())
    punctuation = word[-1] if word[-1].lower() != word_mod[-1] else ''
    if word_mod in words:
        translation.append(words[word_mod] + punctuation)
    else:
        translation.append(word)
translation = ' '.join(translation).split('. ')
print('. '.join(s.capitalize() for s in translation))

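This program has the following advantages:

You can write more than one sentence.
The first letter after a "." is capitalized.
The program appends untranslated words to the output ("translation = []").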

Here is the code for program 2:

words = {('i',): 'jeg', ('read',): 'leste', ('the', 'book'): 'boka'}
max_group = len(max(words))

text = "I read the book".lower().split()
translation = []

position = 0
while text:
    for m in range(max_group - 1, -1, -1):
        word_mod = tuple(text[:position + m])
        if word_mod in words:
            translation.append(words[word_mod])
            text = text[position + m:]
    position += 1

translation = ' '.join(translation).split('. ')
print('. '.join(s.capitalize() for s in translation))

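With this code you can translate idiomatic phrases, for example "the book" to "boka". Here is how the program steps through the text. This is the output: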

1  
('i',)  
['jeg']  
['read', 'the', 'book']  
0  
()  
1  
('read', 'the')  
0  
('read',)  
['jeg', 'leste']  
['the', 'book']  
1  
('the', 'book')  
['jeg', 'leste', 'boka']  
[]  
0  
()  
Jeg leste boka

What I want is to bring some of the code from program 1 into program 2. I have tried many times without success. Here is my dream:
If I change the text to the following:

text = "I read the book. I read the book! I read the book? I read the book.".lower().split()

I want the output to be:

Jeg leste boka. Jeg leste boka! Jeg leste boka? Jeg leste boka.

So please put your thinking caps on and help me find a solution. I greatly appreciate any reply. Thank you very much in advance!

Answer 1

My solution flow would look something like this:

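words = ...
# longest key length; len(max(words)) compares tuples alphabetically, not by length
max_group = max(len(key) for key in words)
text = ...
textWPunc = text.lower().split()
textOnly = [re.sub('[^a-z0-9]', '', x) for x in textWPunc]
translation = []

while textOnly:
    for m in range(max_group, 0, -1):
        if tuple(textOnly[:m]) in words:
            # check for punctuation here using textWPunc[:m]
            if punctuation present in textWPunc[:m]:
                append translated words + punctuation
            else:
                append only translated words
            textOnly = textOnly[m:]
            textWPunc = textWPunc[m:]
            break
    else:
        # nothing matched: keep the word untranslated so the loop always advances
        append textWPunc[0], then drop the first item from both lists

join translation to finish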

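The key part is that you keep two parallel lists of text: one for checking which words to translate, and one for checking punctuation whenever your translation lookup gets a hit. To check for punctuation, I pass the word group I'm examining to re.sub() like this: re.sub('[a-z0-9]', '', wordGroup). That strips all the letters and digits and leaves only the punctuation.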
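
As a small illustration of that punctuation check, here is a sketch that splits one token into its word part and its punctuation part using the two complementary patterns. The helper name split_token is made up for this example, and it assumes lowercase ASCII input as in the question.

```python
import re

def split_token(token):
    """Return (letters_and_digits, everything_else) for one lowercase token."""
    word = re.sub('[^a-z0-9]', '', token)   # drop everything except a-z and 0-9
    punct = re.sub('[a-z0-9]', '', token)   # drop a-z and 0-9, keep the rest
    return word, punct

print(split_token('book!'))   # ('book', '!')
print(split_token('read'))    # ('read', '')
```

Note that if you join a multi-word group with spaces before passing it in, the spaces survive the second pattern too, so it is probably simplest to check one token at a time.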

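One last point: your indexing with the position variable looks a bit odd to me. Since you are truncating the source text as you go, I'm not sure it's really necessary. Just check the leftmost x words instead of using the position variable.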
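
Putting the pieces together, a minimal runnable sketch of the flow might look like the following. It uses the dictionary from program 2 and always looks at the leftmost words, with no position variable. The function name translate is my own, and there are a few assumptions: only punctuation attached to the last word of a matched group is carried over, unmatched words are passed through unchanged as in program 1, and capitalization is done with a regex so that "!" and "?" also start a new sentence.

```python
import re

words = {('i',): 'jeg', ('read',): 'leste', ('the', 'book'): 'boka'}
# Longest key length; len(max(words)) would pick the alphabetically largest key.
max_group = max(len(key) for key in words)

def translate(text):
    with_punct = text.lower().split()
    only = [re.sub('[^a-z0-9]', '', t) for t in with_punct]
    out = []
    while only:
        # Try the longest group first, always starting at the leftmost word.
        for m in range(max_group, 0, -1):
            group = tuple(only[:m])
            if group in words:
                # Assumption: punctuation sits on the last token of the group.
                punct = re.sub('[a-z0-9]', '', with_punct[:m][-1])
                out.append(words[group] + punct)
                only, with_punct = only[m:], with_punct[m:]
                break
        else:
            # No match: keep the original token so the loop always advances.
            out.append(with_punct[0])
            only, with_punct = only[1:], with_punct[1:]
    joined = ' '.join(out)
    # Uppercase the first letter of the text and of each new sentence.
    return re.sub(r'(^|[.!?]\s+)(\w)',
                  lambda mo: mo.group(1) + mo.group(2).upper(), joined)

print(translate("I read the book. I read the book! I read the book? I read the book."))
# Jeg leste boka. Jeg leste boka! Jeg leste boka? Jeg leste boka.
```

The for/else branch is what keeps the loop from spinning forever when nothing in the dictionary matches.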
