Efficient/elegant way to divide text into 10 lists

Time: 2012-05-22 05:19:26

Tags: python list parsing

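Is there a more efficient way to do the following? I'd very much like to think there is. The script serves no particular purpose, but it would still be good to know a more efficient way to do this.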

# Divides text into 10 lists.. you'll see what I mean.
# 5/21/2012

filename = "test.txt"

# READ FILE WORD-BY-WORD:
f = open(filename,"r")
lines = f.readlines()
for i in lines:
    # NOTE: thisline is overwritten on every pass, so only the last line's words are kept
    thisline = i.split(" ")

f.close()

# DIVIDE INTO 10 LISTS:
list1 = []
list2 = []
list3 = []
list4 = []
list5 = []
list6 = []
list7 = []
list8 = []
list9 = []
list10 = []

j = 0
while j < len(thisline):
    x = thisline[j]
    list1.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list2.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list3.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list4.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list5.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list6.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list7.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list8.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list9.append(x)
    j+=1
    if j >= len(thisline):
        break

    x = thisline[j]
    list10.append(x)
    j+=1
    if j >= len(thisline):
        break

print "list 1 = "," ".join(list1)
print "list 2 = "," ".join(list2)
print "list 3 = "," ".join(list3)
print "list 4 = "," ".join(list4)
print "list 5 = "," ".join(list5)
print "list 6 = "," ".join(list6)
print "list 7 = "," ".join(list7)
print "list 8 = "," ".join(list8)
print "list 9 = "," ".join(list9)
print "list 10 = "," ".join(list10)

# EOF

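3 answers:

Answer 0: (score: 3)

If the purpose of this program is to read the lines of a file, split them into words, and append the words to lists so that the Nth list contains every 10th word starting from the Nth, then how about the following: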

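from itertools import izip, cycle

filename = "test.txt"

# Ten empty lists; cycle() hands them out round-robin, forever.
lsts = [[] for _ in range(10)]
oracle = cycle(lsts)

with open(filename, "r") as f:
    for line in f:
        # split() with no argument also drops the trailing newline
        parts = line.split()

        # parts goes first: izip then stops before pulling (and losing)
        # an extra list from oracle at the end of each line
        for part, lst in izip(parts, oracle):
            lst.append(part)

for index, lst in enumerate(lsts):
    print "list %u = " % (index+1,)," ".join(lst)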
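
For anyone on Python 3, where `izip` and the `print` statement are gone, the same round-robin idea might look roughly like the sketch below. The name `distribute_words` and the parameter `n` are made up for illustration, and the function takes any iterable of lines (an open file works) so it can be tried without a file on disk.

```python
from itertools import cycle


def distribute_words(lines, n=10):
    """Deal the words of `lines` round-robin into n lists (hypothetical helper)."""
    buckets = [[] for _ in range(n)]
    targets = cycle(buckets)
    for line in lines:
        # The words go first in zip(): zip stops as soon as the words run out,
        # so no bucket is taken from the cycle and then thrown away.
        for word, bucket in zip(line.split(), targets):
            bucket.append(word)
    return buckets


if __name__ == "__main__":
    # Assumed sample input; with a real file, pass open("test.txt") instead.
    sample = ["a b c d e f g h i j k l", "m n o"]
    for index, bucket in enumerate(distribute_words(sample), start=1):
        print("list %d = %s" % (index, " ".join(bucket)))
```

Because the cycle is shared across lines, the numbering carries over from one line to the next instead of restarting at list 1 on every line.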

Answer 1: (score: 1)

This pads the shorter lists with None, which you can easily filter out if you like:

with open("test.txt") as f:
    result = zip(*map(None, *[(word for line in f for word in line.split())]*10))
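
Note that `map(None, ...)` only works on Python 2. A rough Python 3 equivalent, assuming `itertools.zip_longest` as the padding step, could look like this. The helper name `split_into_columns` and its parameters are invented for the example.

```python
from itertools import zip_longest


def split_into_columns(lines, n=10, fill=None):
    """Return n tuples; tuple k holds words k, k+n, k+2n, ... (hypothetical helper)."""
    words = (word for line in lines for word in line.split())
    # The same generator repeated n times: each row takes n consecutive words,
    # and the last row is padded with `fill` if the words run out.
    rows = zip_longest(*[words] * n, fillvalue=fill)
    # Transposing the rows gives one tuple per list.
    return list(zip(*rows))


if __name__ == "__main__":
    # Assumed sample input; with a real file, pass open("test.txt") instead.
    columns = split_into_columns(["a b c d e f g h i j k l"])
    for index, column in enumerate(columns, start=1):
        # Filter out the padding before printing.
        kept = [word for word in column if word is not None]
        print("list %d = %s" % (index, " ".join(kept)))
```

One thing to be aware of: with no words at all, the result is an empty list rather than ten empty tuples.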

Answer 2: (score: 0)

If you have a file with exactly ten lines of input, you can use:

line1, line2, line3, line4, line5, line6, line7, line8, line9, line10 = open('test.txt','r').readlines()

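You can replace the open().readlines() call with whatever function does the splitting you need. Showing us a sample of the input file would help here.

But why use 10 variables when you can use an indexed sequence?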

lines = open('test.txt', 'r').readlines()
assert line1 == lines[0] # etc
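
Taking the indexed-sequence idea one step further, the ten word lists themselves can live in a single list of lists. The sketch below is only one way to do it, using extended slicing in Python 3; `every_nth` is a made-up name, and it assumes all the words fit in memory at once.

```python
def every_nth(words, n=10):
    """Return n lists; list k holds words[k], words[k+n], ... (hypothetical helper)."""
    # words[k::n] starts at index k and takes every n-th item from there.
    return [words[k::n] for k in range(n)]


if __name__ == "__main__":
    # Assumed sample input; with a real file, use open("test.txt") as the lines.
    sample_lines = ["a b c d e f", "g h i j k l"]
    words = [word for line in sample_lines for word in line.split()]
    lists = every_nth(words)
    for index, lst in enumerate(lists, start=1):
        print("list %d = %s" % (index, " ".join(lst)))
```

Each list is then reachable as `lists[0]` through `lists[9]`, so no numbered variables are needed.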