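Decompressing a text file

Time: 2016-07-12 13:32:26

Tags: python

I have now compressed my text, and I need to decompress it so that I can recreate the original text.

The compression code is: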

import zlib, base64

text = raw_input("Enter a sentence: ")#Asks the user to input text
text = text.split()#Splits the sentence

uniquewords = [] #Creates an empty array 
for word in text: #Loop to do the following 
    if word not in uniquewords: #If the word is not in uniquewords
         uniquewords.append(word) #It adds the word to the empty array

positions = [uniquewords.index(word) for word in text] #Finds the positions of each uniqueword
positions2 = [x+1 for x in positions] #Adds 1 to each position
print ("The uniquewords and the positions of the words are: ") #Prints the uniquewords and positions
print uniquewords 
print positions2

file = open('task3file.txt', 'w')
file.write('\n'.join(uniquewords))#Adds the uniquewords to the file
file.write('\n')
file.write('\n'.join([str(p) for p in positions2]))
file.close()

file = open('compressedtext.txt', 'w')

text = ', '.join(text)

compression =  base64.b64encode(zlib.compress(text,9))

file.write('\n'.join(compression))

print compression

file.close()

My attempt at decompression is:

import zlib, base64

text = ('compressedtext.txt')

file = open('compressedtext.txt', 'r')

print ("In the file is: \n") + file.read()

text = ''.join(text)
data = zlib.decompress(base64.b64decode(text))

recreated = " ".join([uniquewords[word] for word in positions]) #Recreates the sentence

file.close() #Closes the file

print ("The sentences recreated: \n") + recreated 

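However, when I run the decompression and try to recreate the original text, I get this error message:

File "C:\Python27\lib\base64.py", line 77, in b64decode
    raise TypeError(msg)
TypeError: Incorrect padding

Does anyone know how to fix this error?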

1 answer:

Answer 0: (score: 2)

There are a few things going on here. First, let me give you a working sample:

import zlib, base64

rawtext = raw_input("Enter a sentence: ")  # Asks the user to input text
text = rawtext.split()  # Splits the sentence

uniquewords = []  # Creates an empty array
for word in text:  # Loop to do the following
    if word not in uniquewords:  # If the word is not in uniquewords
        uniquewords.append(word)  # It adds the word to the empty array

positions = [uniquewords.index(word) for word in text]  # Finds the positions of each uniqueword
positions2 = [x+1 for x in positions]  # Adds 1 to each position
print ("The uniquewords and the positions of the words are: ")  # Prints the uniquewords and positions
print uniquewords
print positions2

infile = open('task3file.txt', 'w')
infile.write('\n'.join(uniquewords))  # Adds the uniquewords to the file
infile.write('\n')
infile.write('\n'.join([str(p) for p in positions2]))
infile.close()

infile = open('compressedtext.b2', 'w')

compression = base64.b64encode(zlib.compress(rawtext, 9))

infile.write(compression)

print compression

infile.close()

# Now read it again

infile = open('compressedtext.b2', 'r')
text = infile.read()
print("In the file is: " + text)
recreated = zlib.decompress(base64.b64decode(text))
infile.close()
print("The sentences recreated:\n" + recreated)
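The sample above recovers the sentence directly from the compressed file. The question's last step also tries to rebuild it from the unique words and their positions. A minimal sketch of that step follows, assuming Python 3 and that both lists are already in memory (the example sentence is made up for illustration). Because the stored positions are 1-based, 1 is subtracted before indexing.

```python
# Python 3 sketch; variable names mirror the question's code,
# and the sentence is only an illustrative assumption.
sentence = "the cat sat on the mat"
words = sentence.split()

# Unique words in first-seen order.
uniquewords = []
for word in words:
    if word not in uniquewords:
        uniquewords.append(word)

# 1-based positions, as stored in positions2.
positions2 = [uniquewords.index(word) + 1 for word in words]

# Subtract 1 to turn each stored position back into a list index.
recreated = " ".join(uniquewords[p - 1] for p in positions2)
print(recreated)
```

Note that `split()` collapses runs of whitespace, so the rebuilt sentence matches the input exactly only when the words were separated by single spaces.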

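I tried to keep the working sample very close to your code, but note a few changes in particular:

  • I keep more careful track of the raw text versus the processed text.

  • I removed the redefinition of zlib.

  • I removed the extra newlines that broke the decompression.

  • I did some general cleanup to better follow common Python conventions.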
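For comparison, here is a hedged Python 3 sketch of the same round trip. The helper names `compress_text` and `decompress_text` are hypothetical, and file I/O is left out to keep it short. It also illustrates what looks like the immediate cause of the padding error: in the question's attempt, `text` still holds the file name when it reaches `b64decode`, because the result of `file.read()` is printed but never stored.

```python
import base64
import binascii
import zlib


def compress_text(text):
    """Return text zlib-compressed, then Base64-encoded, as an ASCII string."""
    return base64.b64encode(zlib.compress(text.encode("utf-8"), 9)).decode("ascii")


def decompress_text(encoded):
    """Reverse compress_text: Base64-decode, then zlib-decompress."""
    return zlib.decompress(base64.b64decode(encoded)).decode("utf-8")


original = "the cat sat on the mat"
encoded = compress_text(original)
restored = decompress_text(encoded)
print(restored)

# Decoding the file *name* instead of the file's contents fails,
# because "compressedtext.txt" is not a valid Base64 string.
try:
    base64.b64decode("compressedtext.txt")
    filename_decoded = True
except binascii.Error:
    filename_decoded = False
print(filename_decoded)
```

In Python 3 the failure surfaces as `binascii.Error` rather than the `TypeError` that Python 2.7 raises.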

Hope this helps.