I currently have a script that splits a file into multiple output files, each containing two lines.
For example, the original file:
AAA
BBB
CCC
DDD
EEE
The output files are:
output1.txt    output2.txt    etc.
AAA            CCC
BBB            DDD
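I would like to know how to make the code create two output files at a time: one that takes two lines as it does now, and another that contains everything else in the file, for example:
output1.txt    rest1.txt    output2.txt    rest2.txt
AAA            CCC          CCC            AAA
BBB            DDD          DDD            BBB
               EEE                         EEE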
This is the code I have so far, which produces the first example:
splitLen = 2           # lines per file
outputBase = 'output'  # output0.txt, output1.txt, etc.
input = open('file.txt', 'r')
count = 0
at = 0
dest = None
for line in input:
    if count % splitLen == 0:
        # start a new output file every splitLen lines
        if dest: dest.close()
        dest = open(outputBase + str(at) + '.txt', 'w')
        at += 1
    dest.write(line)
    count += 1
# close the last output file and the input file
if dest: dest.close()
input.close()
Thanks!
Answer 0 (score: 1)
I would structure it as follows:
infile = 'file.txt'  # input path, as in the question

with open(infile) as f:
    num_lines = sum(1 for line in f)

# (0, 1), (2, 3), ...: the line indices that go into each output file
pairs = ((i, i+1) for i in range(0, num_lines-1, 2))
for i, pair in enumerate(pairs):
    with open('output{}'.format(i), 'w') as op, \
         open('rest{}'.format(i), 'w') as rest, \
         open(infile) as f:
        for j, line in enumerate(f):
            if j in pair:
                op.write(line)
            else:
                rest.write(line)
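First, find out how many lines the input file has. Next, write a generator that yields pairs of line indices (i.e. (0,1), then (2,3), then ...), which correspond to the lines you want in each "output" file. From there it is very straightforward: re-read the input once per pair and send each line either to the output file or to the rest file.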
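To make the pair logic easier to check without touching the disk, here is a minimal sketch that works on a list of lines instead of files. The helper names `line_pairs` and `route` are made up for this illustration and are not part of the answer's code.

```python
def line_pairs(num_lines):
    """Yield (i, i + 1) index pairs: (0, 1), (2, 3), ...

    Same expression as in the answer; a trailing unpaired line gets no pair.
    """
    return ((i, i + 1) for i in range(0, num_lines - 1, 2))


def route(lines, pair):
    """Split lines into (selected, rest) according to one index pair."""
    selected = [line for j, line in enumerate(lines) if j in pair]
    rest = [line for j, line in enumerate(lines) if j not in pair]
    return selected, rest


# Sample data taken from the question.
lines = ['AAA\n', 'BBB\n', 'CCC\n', 'DDD\n', 'EEE\n']
pairs = list(line_pairs(len(lines)))
print(pairs)
for pair in pairs:
    print(route(lines, pair))
```

With the five-line sample, only two pairs are produced, so the last line (EEE) shows up only in the "rest" lists, which matches the layout asked for in the question.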
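Answer 1 (score: 1)
As long as the file is not too large to fit in memory, you can read the input file into a list and then use slice operations to build the output files.
(Edit) Changed the displayed values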
splitLen = 2                 # lines per file
outputBase = 'output%d.txt'  # output0.txt, output1.txt, etc.
restBase = 'rest%d.txt'
with open('file.txt', 'r') as fp:
    input_list = fp.readlines()
    # to skip empties: input_list = [l for l in fp if l.strip()]
for i in range(0, len(input_list), splitLen):
    iteration = i // splitLen  # integer division, so %d gets an int
    print('iter %d' % iteration)
    with open(outputBase % iteration, 'w') as fp:
        fp.write(''.join(input_list[i:i+splitLen]))
    with open(restBase % iteration, 'w') as fp:
        # everything before and after the current chunk
        fp.write(''.join(input_list[:i]))
        fp.write(''.join(input_list[i+splitLen:]))
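The slicing idea can also be wrapped in functions, which makes it easier to try on sample data. This is only a sketch under a few assumptions: the names `split_with_rest` and `write_parts` are hypothetical, and the files are written to a temporary directory rather than the current one.

```python
import os
import tempfile


def split_with_rest(lines, split_len=2):
    """Return a list of (chunk, rest) pairs built with list slicing."""
    result = []
    for i in range(0, len(lines), split_len):
        chunk = lines[i:i + split_len]
        rest = lines[:i] + lines[i + split_len:]
        result.append((chunk, rest))
    return result


def write_parts(parts, directory):
    """Write output<N>.txt and rest<N>.txt for each pair, numbering from 0."""
    for n, (chunk, rest) in enumerate(parts):
        with open(os.path.join(directory, 'output%d.txt' % n), 'w') as out:
            out.writelines(chunk)
        with open(os.path.join(directory, 'rest%d.txt' % n), 'w') as out:
            out.writelines(rest)


# Sample data taken from the question.
lines = ['AAA\n', 'BBB\n', 'CCC\n', 'DDD\n', 'EEE\n']
parts = split_with_rest(lines)
out_dir = tempfile.mkdtemp()  # temporary directory, an assumption for the demo
write_parts(parts, out_dir)
print(sorted(os.listdir(out_dir)))
```

Note that, like the code above, this also creates a file pair for the trailing partial chunk (EEE on its own), which goes one step further than the example in the question.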