Parsing newline characters in Python

Date: 2013-01-14 17:02:00

Tags: python parsing file-io newline

I'm working on a fairly basic encoder/decoder where you can supply your own text file (as a string) and your own encoder (also as a string: it must be a text file).

Here is my decoder function:

def cDecode(file_name, encoder='standard_encoder.txt', save_new=True): # does not decode multi-lines correctly -- everything goes on a single line. See next comment
    '''Decodes <'file_name'> with the reverse method of <'encoder'>.'''
    if type(file_name) != str or type(encoder) != str: raise TypeError("<'file_name'> and <'encoder'> must be of type <'str'>.")
    if type(save_new) != bool: raise TypeError("<'save_new'> must be of type <'bool'>.")
    if file_name[-4:] != '.txt': file_name += '.txt'
    if encoder[-4:] != '.txt': encoder += '.txt'
    decoder_set = {}
    try:
        with open(encoder, 'r') as encoding_file:
            for line in encoding_file:
                line_parts = line.split(': ')
                my_key, my_value = line_parts[1], line_parts[0]

I think the error is here: I have to remove the '\n' because each character mapping (in the decoding file) is on its own line, like: 'A: Ð'.

                if '\n' in my_key:
                        loc = my_key.find('\n') # this may be the cause of the single-line of the decoding.
                        my_key = my_key[:loc] + my_key[loc + 1:]
                decoder_set[my_key] = my_value
        encoding_file.close()
    except IOError:
        encoder = 'standard_encoder.txt'
        with open(encoder, 'r') as encoding_file:
            for line in encoding_file:
                line_parts = line.split(': ')
                my_key, my_value = line_parts[1], line_parts[0]
                # every key has a new line character automatically because it's on a different line
                if '\n' in my_key:
                        loc = my_key.find('\n')
                        my_key = my_key[:loc] + my_key[loc + 1:]
                decoder_set[my_key] = my_value
        encoding_file.close()
    decodingKeys = decoder_set.keys()
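
The newline stripping above can be written more compactly. The following is only a sketch, assuming the encoder file really consists of lines shaped like 'A: Ð'; the helper name `load_decoder` is hypothetical and not part of the original code. It removes just the trailing '\n' from each line before splitting, so the newline never becomes part of a key.

```python
def load_decoder(lines):
    """Build {encoded_char: original_char} from lines such as 'A: Ð\n'.

    `lines` can be an open text file or any iterable of strings.
    """
    decoder = {}
    for line in lines:
        line = line.rstrip('\n')  # drop only the trailing newline
        original, sep, encoded = line.partition(': ')
        if not sep:
            continue  # skip blank or malformed lines (no ': ' separator)
        decoder[encoded] = original
    return decoder


# Example with in-memory lines instead of a real encoder file
sample_lines = ['A: Ð\n', 'B: ß\n']
decoder_set = load_decoder(sample_lines)
```

Because only the line ending is stripped, a mapping whose encoded character is a space would still be kept intact.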

Here is the rest of the function:

    if save_new:
        try:
            decoded_file_name = file_name[:-12] + '_decoded' + file_name[-4:]
            encoded_file = open(decoded_file_name, 'a+')
            with open(file_name, 'r') as my_file:
                for line in my_file:
                    de_line = ''
                    for char in line:
                        if char in decodingKeys: de_char = decoder_set[char]
                        else: de_char = char
                        de_line += de_char
                    encoded_file.write(de_line)
        except IOError:
            raise NameError(file_name + ' was not found. Decoding process terminated.')
    else:
        try:
            import os
            decoded_file_name = file_name[:-12] + '_decoded' + file_name[-4:]
            # write the decoded text to a separate file, then swap it in for the original
            with open(file_name, 'r') as my_file, open(decoded_file_name, 'w') as encoded_file:
                for line in my_file:
                    de_line = ''
                    for char in line:
                        if char in decodingKeys: de_char = decoder_set[char]
                        else: de_char = char
                        de_line += de_char
                    encoded_file.write(de_line)
            os.remove(file_name)
            os.rename(decoded_file_name, file_name)
        except IOError:
            raise NameError(file_name + ' was not found. Decoding process terminated.')

Suppose I have a multi-line text file:

This is a test.
As is this one.
Good bye!

After encoding and then decoding, it comes out like this: This is a test.As is this one.Good bye!

How do I fix this? I'd like it to come out as:

This is a test.
As is this one.
Good bye!

Thanks!

1 Answer:

Answer 0: (score: 2)

Add '\n' when writing the line back to the file:

encoded_file.write(de_line+'\n')
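
One caveat with appending '\n' unconditionally: if a line read from the file still ends with its own newline, the output would get a blank line after each line. A minimal sketch of one way around that, assuming the mapping is a plain dict from encoded character to original character (the helper name `decode_line` is hypothetical, not from the question):

```python
def decode_line(line, decoder):
    """Decode one line and return it with exactly one trailing newline."""
    body = line.rstrip('\n')  # remove the line ending if there is one
    # characters without a mapping are passed through unchanged
    decoded = ''.join(decoder.get(ch, ch) for ch in body)
    return decoded + '\n'


# Example with a tiny made-up mapping
decoder_set = {'Ð': 'A'}
result = decode_line('Ðs is this one.\n', decoder_set)
```

Inside the loop this would be used as `encoded_file.write(decode_line(line, decoder_set))`. Note that this sketch also adds a newline to the last line of the file even if the input did not end with one.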