Question

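I'm embarrassed to be asking for help again, but I'm stuck.

I have a Spanish novel (plain text) and a Python script that is supposed to insert translations of hard-to-translate words, in parentheses, using a custom dictionary stored in another text file.

After a lot of trial and error, I've managed to get the script to run and write the novel out to a new text file, as it should.

The only problem is that the text of the novel is unchanged; that is, the translations are not inserted into the text. The dictionary is a plain text file, formatted like this: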

[spanish word] [english translation]                                      
[spanish word] [english translation]

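And so on. Note that the words are not actually enclosed in brackets. There is a single space between the two words on each line, and no spaces anywhere else in the file.

Here is the code in question: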

bookin = (open("novel.txt")).read()
subin = open("dictionary.txt")
for line in subin.readlines():
    ogword, meaning = line.split(" ")
    subword = ogword + "(meaning)"
    bookin.replace(ogword, subword)
    ogword = ogword.capitalize()
    subword = ogword + "(meaning)"
    bookin.replace(ogword, subword)
subin.close()
bookout = open("output.txt", "w")
bookout.write(bookin)
bookout.close()

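Any advice would be greatly appreciated.

Edit: The MemoryError is now solved; I think there was an error in the dictionary, which I have fixed. Many thanks to those who helped me with this silly problem!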

Answer 1

Change:

bookin.replace(ogword, subword)

to

bookin = bookin.replace(ogword, subword)

Explanation: replace does not modify the string (in fact, strings are immutable); it returns a new string instead.
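
To make the explanation concrete, here is a minimal sketch written for Python 3. The sample sentence and the variable names text and result are made up for illustration and do not come from the question.

```python
text = "el gato duerme"

# str.replace leaves the original string untouched and returns a new one.
result = text.replace("gato", "gato (cat)")
print(text)    # el gato duerme
print(result)  # el gato (cat) duerme

# Discarding the return value is the bug in the question;
# rebinding the name to the returned string keeps the change.
text = text.replace("gato", "gato (cat)")
print(text)    # el gato (cat) duerme
```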

Answer 2

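As @David Robinson points out, the problem is with your use of replace. It should be

 bookin = bookin.replace(ogword, subword)

Last night, when you posted your question, I was pleased (I upvoted both the question and the answer; I didn't get to post my own in time), but the problem kept bugging me. Even though an answer has been posted and accepted, I wanted to offer the following advice, because I believe that if you can produce code like the code shown above, you can most likely find the source of such problems yourself.

My suggestion for this sort of problem is to create a small data file, say 10 records/lines, and use it to trace the data through your program with some diagnostic print statements. I am showing a version of that below. It is not completely finished, but I hope the intention is clear.

The basic idea is to verify that everything you expect to happen actually is happening at every step, by looking at the output generated by the "debugging print statements". In this case, you would see that bookin is not being modified.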

bookin = (open("novel.txt")).read()
subin = open("dictionary.txt")

print 'bookin =', bookin  # verify that you read the information 

for line in subin.readlines():
    print 'line = ', line # verify line read

    ogword, meaning = line.split(" ")
    print 'ogword, meaning = ', ogword, meaning # verify ...

    subword = ogword + "(meaning)"
    print 'subword =', subword # verify ...

    bookin.replace(ogword, subword)
    print 'bookin post replace =', bookin # verify ... etc

    ogword = ogword.capitalize()
    subword = ogword + "(meaning)"
    bookin.replace(ogword, subword)

subin.close() 

print 'final bookin =', bookin # make sure final output is good ...
bookout = open("output.txt", "w")
bookout.write(bookin)
bookout.close()

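Finally, an added benefit of Python compared with other languages is that you can use it interactively. What I often end up doing is verifying my understanding of functions and behavior in the interpreter (I'm often too lazy to look at the documentation, which is actually not a good thing). So in your case, since the problem is with replace (as my debugging print statements would have shown me), I would have tried the following sequence in the interpreter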

 s = 'this is a test'
 print s
 s.replace('this', 'that')
 print s

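and would have seen that s had not changed, in which case I would have looked at the documentation, or simply tried s = s.replace('this', 'that').

I hope this helps. This kind of basic debugging technique can often help pinpoint the problem area and is a good first step. Down the line, debuggers and the like are very useful.

PS: I'm new to SO, so I hope this kind of extra answer is not frowned upon.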

Answer 3

You can get this information by typing any of the following in the interpreter:

>>> help(str.replace)  
>>> help('a'.replace)  
>>> s = 'a'  
>>> help(s.replace)  
>>> import string  
>>> help(string.replace)

Answer 4

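Apart from the MemoryError, which is surprising considering the size of the files, there are a few other things you could improve; see the comments below: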

bookin = open("novel.txt").read() # don't need extra ()
subin = open("dictionary.txt")
# for line in subin.readlines():
# readlines() reads the whole file, you don't need that
for line in subin:
    # ogword, meaning = line.split(" ")
    # the above will leave a newline on the end of "meaning"
    ogword, meaning = line.split()
    # subword = ogword + "(meaning)"
    # if ogword is "gato" and meaning is "cat",
    # you want "gato (cat)"
    # but you will get "gato(meaning)"
    subword = ogword + " (" + meaning + ")"
    bookin = bookin.replace(ogword, subword)
    ogword = ogword.capitalize()
    subword = ogword + "(meaning)"  # fix this also
    bookin.replace(ogword, subword) # fix this also
    print len(bookin) # help debug your MemoryError
subin.close()
bookout = open("output.txt", "w")
bookout.write(bookin)
bookout.close()
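
With the two "fix this also" lines addressed as well, the loop might look like the sketch below. It is written for Python 3 and is one possible arrangement, not the answerer's own code. The function names annotate and annotate_file, the UTF-8 default encoding, and the choice to skip lines that do not contain exactly two words are all assumptions made for illustration. It is still plain substring replacement.

```python
def annotate(book, dictionary_lines):
    """Return book with ' (meaning)' appended after every occurrence of each word."""
    for line in dictionary_lines:
        parts = line.split()  # split() also drops the trailing newline
        if len(parts) != 2:
            continue  # assumption: silently skip blank or malformed lines
        word, meaning = parts
        # Handle the word as written and its capitalized form; the set avoids
        # replacing twice when the dictionary entry is already capitalized.
        for form in sorted({word, word.capitalize()}):
            book = book.replace(form, form + " (" + meaning + ")")
    return book


def annotate_file(novel_path, dictionary_path, output_path, encoding="utf-8"):
    """Read the novel, annotate it, and write the result to output_path."""
    with open(novel_path, encoding=encoding) as novel:
        book = novel.read()
    with open(dictionary_path, encoding=encoding) as dictionary:
        book = annotate(book, dictionary)  # a file object iterates line by line
    with open(output_path, "w", encoding=encoding) as out:
        out.write(book)
```

Calling annotate_file("novel.txt", "dictionary.txt", "output.txt") would mirror the original script. Keeping the string logic in annotate makes it easy to try on a few in-memory lines before running it on the real files.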

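You need to follow @Levon's advice and try your code on some small test data files so that you can see what is happening.

After you have used this one-line dictionary: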

gato cat

with this one-line novel:

El gato se sirvió un poco de Gatorade para el "alligator".

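you may wish to reconsider your high-level strategy: a plain substring replace also hits the "Gato" in "Gatorade" and the "gato" in "alligator".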
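
One possible direction, offered as a sketch and not as the answerer's method, is to match whole words with a regular expression instead of replacing substrings. The code below is written for Python 3, where \b is Unicode-aware for str patterns. The function names naive_annotate and annotate_whole_words, and passing the glossary as a dict, are hypothetical choices for illustration.

```python
import re

novel = 'El gato se sirvió un poco de Gatorade para el "alligator".'


def naive_annotate(text, word, meaning):
    """Substring replacement, as in the corrected script."""
    for form in (word, word.capitalize()):
        text = text.replace(form, form + " (" + meaning + ")")
    return text


def annotate_whole_words(text, glossary):
    """Annotate only whole-word matches, in a single pass over the text."""
    lookup = {}
    for word, meaning in glossary.items():
        lookup[word] = meaning
        lookup[word.capitalize()] = meaning
    if not lookup:
        return text  # nothing to look for
    # \b anchors each alternative at word boundaries, so "gato" no longer
    # matches inside "Gatorade" or "alligator".
    alternatives = "|".join(re.escape(form) for form in lookup)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    return pattern.sub(lambda m: m.group(0) + " (" + lookup[m.group(0)] + ")", text)


print(naive_annotate(novel, "gato", "cat"))
# El gato (cat) se sirvió un poco de Gato (cat)rade para el "alligato (cat)r".
print(annotate_whole_words(novel, {"gato": "cat"}))
# El gato (cat) se sirvió un poco de Gatorade para el "alligator".
```

Because all words are combined into one pattern and the text is scanned once, a translation that happens to contain another dictionary word is not annotated again. Inflected forms such as plurals would still need their own dictionary entries.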

Python script writes text to a file, but does not add the text it should