How do I encode a pre-existing text file as UTF-8 into a separate file?

Time: 2017-05-20 23:50:25

Tags: python python-2.7 utf-8 encode

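I'm trying to take a pre-existing text file and write it out encoded as UTF-8. I've made a menu that asks the user which text file they want to encode, but after that I'm completely lost. I looked at an earlier post and merged its code into mine, but I'm not sure how it works or what I'm doing.

Any help is greatly appreciated!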

import codecs

def getMenuSelection():
    print "\n"
    print "\t\tWhich of the following files would you like to encode?"
    print "\n"
    print "\t\t================================================"
    print "\t\t1. hamletQuote.txt"
    print "\t\t2. RandomQuote.txt"
    print "\t\t3. WeWillRockYou.txt"    
    print "\t\t================================================"
    print "\t\tq or Q to quit"
    print "\t\t================================================"

    print ""

    selection = raw_input("\t\t")
    return selection

again = True

while (again == True):

    choice = getMenuSelection()

    if choice.lower() == 1 :

        with codecs.open(hamletQuote.txt,'r',encoding='utf8') as f:
            text = f.read()

        with codecs.open(hamletQuote.txt,'w',encoding='utf8') as f:
            f.write(text)

    if choice.lower() == 2 :

        with codecs.open(RandomQuote.txt,'r',encoding='utf8') as f:
            text = f.read()

        with codecs.open(RandomQuote.txt,'w',encoding='utf8') as f:
            f.write(text)

    if choice.lower() == 3 :

        with codecs.open(WeWillRockYou.txt,'r',encoding='utf8') as f:
            text = f.read()

        with codecs.open(WeWillRockYou.txt,'w',encoding='utf8') as f:
            f.write(text)

    elif choice.lower() == "q":
        again = False

2 answers:

Answer 0: (score: 1)

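Your code will work, but you need to make the file names strings. Your input file name is also the same as your output file name, so the input file will be overwritten. You can fix this by giving the output file a different name.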
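
A minimal sketch of that fix, assuming an output name of hamletQuote_utf8.txt and a helper called copy_as_utf8 (both are illustrative and do not come from the question). Like the question's code, it presumes the input can already be decoded as UTF-8, and it runs in a temporary directory so no real file is touched:

```python
import codecs
import os
import tempfile

def copy_as_utf8(src_path, dst_path):
    """Read src_path as UTF-8 text and write it to dst_path as UTF-8."""
    # File names are passed as strings (quoted), unlike in the question.
    with codecs.open(src_path, 'r', encoding='utf8') as f:
        text = f.read()
    # A different output name keeps the input file from being overwritten.
    with codecs.open(dst_path, 'w', encoding='utf8') as f:
        f.write(text)
    return text

# Demo setup: create a small sample input file (contents are made up).
demo_dir = tempfile.mkdtemp()
src = os.path.join(demo_dir, "hamletQuote.txt")
dst = os.path.join(demo_dir, "hamletQuote_utf8.txt")  # hypothetical output name
with codecs.open(src, 'w', encoding='utf8') as f:
    f.write(u"To be, or not to be")

copied = copy_as_utf8(src, dst)
```

After the call, both files exist side by side: the original is untouched and the copy holds the same text.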

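If you're curious how it works, codecs.open opens an encoded file in the given mode; here r means read mode and w means write mode. f refers to the file object, which has several methods, including read() and write() (both of which you have used).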

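The with statement simplifies working with files: it ensures that cleanup always happens. Without the with block, you would have to call f.close() yourself once you are done with the file.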
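
To make the cleanup point concrete, here is a rough sketch comparing the two forms. The file name and contents are made up for the demonstration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # throwaway demo file

# With a with block: the file is closed automatically when the block exits,
# even if an exception is raised inside it.
with open(path, 'w') as f:
    f.write("hello")
closed_after_with = f.closed

# Without a with block: close() has to be called explicitly, and try/finally
# is needed to guarantee it still runs if read() fails.
f = open(path, 'r')
try:
    text = f.read()
finally:
    f.close()
closed_after_manual = f.closed
```

In both cases f.closed ends up True; the with form simply gets there with less code.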

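Answer 1: (score: 0)

Why not use the regular open function instead: open the output file in binary mode and write the text encoded as UTF-8. The input file needs to be opened in regular read mode because it isn't encoded: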

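# Read the original file in regular text mode.
with open("hamletQuote.txt", 'r') as read_file:
    text = read_file.read()

# Write the UTF-8 bytes to a separate file (the name is illustrative) so the
# original is not overwritten. On Python 2, read() returns a byte string, so
# .encode() only succeeds here if the input is pure ASCII.
with open("hamletQuote_utf8.txt", 'wb') as write_file:
    write_file.write(text.encode("utf-8"))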

But if you insist on using codecs, you can do it like this:

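# Read the original file.
with codecs.open("hamletQuote.txt", 'r') as read_file:
    text = read_file.read()

# codecs.open already encodes on write, so text is passed as-is; a separate
# output name (illustrative) keeps the original from being overwritten.
with codecs.open("hamletQuote_utf8.txt", 'wb', encoding="utf-8") as write_file:
    write_file.write(text)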
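
Both snippets above leave the input encoding implicit. On Python 2.7, read() then returns a byte string, which can only be converted implicitly when the data is pure ASCII. If the original encoding is known, one option is io.open, which accepts an encoding argument on both Python 2.7 and Python 3. The sketch below assumes the source file is Latin-1, which is only a placeholder for whatever encoding the file really uses; the helper name transcode_to_utf8 and the output file name are likewise illustrative:

```python
import io
import os
import tempfile

def transcode_to_utf8(src_path, dst_path, src_encoding):
    """Decode src_path using src_encoding and write it to dst_path as UTF-8."""
    with io.open(src_path, 'r', encoding=src_encoding) as read_file:
        text = read_file.read()    # unicode text, decoded from src_encoding
    with io.open(dst_path, 'w', encoding='utf-8') as write_file:
        write_file.write(text)     # io.open encodes to UTF-8 on write

# Demo setup: a small Latin-1 file in a temporary directory.
demo_dir = tempfile.mkdtemp()
src = os.path.join(demo_dir, "RandomQuote.txt")
dst = os.path.join(demo_dir, "RandomQuote_utf8.txt")  # hypothetical output name
with io.open(src, 'wb') as f:
    f.write(u"caf\u00e9".encode("latin-1"))  # the accented letter is one byte, 0xE9

transcode_to_utf8(src, dst, "latin-1")       # assumed source encoding

with io.open(dst, 'rb') as f:
    utf8_bytes = f.read()
```

Decoding explicitly on read and encoding explicitly on write avoids the implicit ASCII conversion, at the cost of having to know (or guess) the source encoding.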