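Splitting text into 256-character chunks

Time: 2014-04-17 01:35:27

Tags: python

Hi, I'm new to programming; the only thing I know is very basic HTML.

I'm trying to split a text into 256-character parts. From what I've learned, I should use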

inFile = open('words.txt', 'r')

to open the text file, then

contents = inFile.read()
print(contents)

to read it, and then I should use

str1 = file.read(256)

to split the text into groups.

But I don't understand how to use these two together.

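3 answers:

Answer 0: (score: 3)

The .read method reads the given number of bytes, or the whole file if no number is given. (That is the Python 2 behavior, and also what happens in binary mode; in Python 3 text mode, .read(256) counts characters.) To split by characters rather than bytes, you should read the entire file and then chunk it yourself. For example: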

# This is just a convenience so you don't have to worry about closing the file
with open('words.txt', 'r') as inFile:
    # Read the file
    contents = inFile.read()
    # This will store the different 256 character bits
    groups = []
    # while the contents contain something
    while contents:
        # Add the first 256 characters to the grouping
        groups.append(contents[:256])
        # Set the contents to everything after the first 256
        contents = contents[256:]
    print(groups)
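
To see why the difference between bytes and characters matters, here is a small sketch. The sample string is made up for illustration, and the snippet assumes Python 3 and UTF-8 encoding:

```python
# Hypothetical sample text: four characters, the last one non-ASCII.
text = "caf\u00e9"
encoded = text.encode("utf-8")

char_count = len(text)     # counts characters
byte_count = len(encoded)  # counts bytes; the accented letter takes two

# Cutting the encoded bytes at an arbitrary offset can land in the
# middle of a multi-byte character, which then fails to decode.
try:
    encoded[:4].decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

print(char_count, byte_count, split_ok)
```

Slicing the decoded string, as in the loop above, never splits a character this way.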

Answer 1: (score: 1)

Alternatively, use a list comprehension:

with open('words.txt', 'r') as inFile:
    groups = [group for group in iter(lambda: inFile.read(256), '')]

Update

If words.txt contains non-ASCII characters and is encoded as UTF-8:

import codecs
with codecs.open('words.txt', 'r', 'utf-8') as inFile:
    groups = [group for group in iter(lambda: inFile.read(256), '')]
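
On Python 3 the built-in open accepts an encoding argument, so codecs is not strictly needed there. Below is a minimal sketch under that assumption; the helper name read_chunks and the throwaway sample file are made up for illustration and are not part of the original answer:

```python
import os
import tempfile
from functools import partial


def read_chunks(path, size=256, encoding="utf-8"):
    """Yield successive pieces of at most `size` characters from a text file."""
    with open(path, "r", encoding=encoding) as in_file:
        # iter(callable, sentinel) keeps calling in_file.read(size)
        # until it returns the empty string at end of file.
        for chunk in iter(partial(in_file.read, size), ""):
            yield chunk


# Demo on a temporary file holding 600 non-ASCII characters.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("\u00e9" * 600)
    sample_path = tmp.name

groups = list(read_chunks(sample_path))
os.remove(sample_path)

chunk_lengths = [len(group) for group in groups]
print(chunk_lengths)
```

Because the file is opened in text mode, each chunk is measured in characters, and the whole file never has to be held in memory at once.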

Answer 2: (score: 0)

I think people need to be friendlier to those who are new to programming.

inFile = open('words.txt', 'r')
contents = inFile.read()  # Read the whole file from disk into memory.

Now contents holds all the characters in words.txt.

You can get the first 256 characters like this:

str1 = contents[:256]    #Slice

And you can get the second 256 characters like this:

str2 = contents[256:512] #Slice
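
The same slicing pattern can be repeated for every block by stepping the start index in increments of 256. A minimal sketch, where split_every is a hypothetical helper name and the sample string stands in for the file contents:

```python
def split_every(text, size=256):
    """Return a list of consecutive slices of `text`, each at most `size` long."""
    # range(0, len(text), size) yields the start index of each slice:
    # 0, 256, 512, ...
    return [text[start:start + size] for start in range(0, len(text), size)]


# Stand-in for the contents read from words.txt.
contents = "x" * 600
groups = split_every(contents)

print([len(group) for group in groups])
```

Here groups[0] is the same as contents[:256] and groups[1] is the same as contents[256:512]; the last group is simply shorter when the length is not a multiple of 256.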