How to split a huge txt file with the help of Python

Asked: 2013-10-30 05:57:26

Tags: python

I have a huge text file (models.txt) that contains lines like the following:

Model 1
text
text
text
text
END

Model 2
text
text
text
text
END

Model 3
text
text
text
text
END

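I want to write a function that treats "Model 1", "Model 2" and "Model 3" as start points and "END" as the end point, and writes each block to its own file: model_1.txt, model_2.txt and model_3.txt.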

I don't know much about programming, so this is what I wrote:

a = open('C:/Users/Zebrafish/Desktop/AHR_human_modeling/human/edited/1AHH.B99990013.pdb','r')
lines = a.readlines()

x = 1

for line in lines:
    if 'END' in line:
        PDB_file = open('C:/Users/Zebrafish/Desktop/AHR_human_modeling/human/edited/model_1.pdb','w')
        PDB_file.write(line)
        PDB_file.close()

2 Answers:

Answer 0 (score: 4)

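from itertools import groupby

with open('infile') as f:
    # Group consecutive lines by whether they are blank (whitespace only)
    groups = groupby(f, key=str.isspace)
    for k, lines in groups:
        if k:
            # Skip the blank separator lines between blocks
            continue
        # The first line of a block ("Model 1") becomes the file name: model_1.txt
        fname = next(lines).strip().lower().replace(' ', '_') + '.txt'
        with open(fname, 'w') as outf:
            # Write the rest of the block, up to and including END
            outf.writelines(lines)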
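To see what the groupby call is doing, here is a minimal sketch that runs the same grouping on a small in-memory list instead of a file. The `sample` data and the `blocks` helper are made up for illustration. Note that this approach relies on blank lines separating the blocks, so a blank line inside a block would split it in two.

```python
from itertools import groupby

# Hypothetical sample mirroring the layout of models.txt
sample = ["Model 1\n", "text\n", "END\n", "\n", "Model 2\n", "text\n", "END\n"]

def blocks(lines):
    """Return the runs of consecutive non-blank lines, in order."""
    return [
        list(run)
        for is_blank, run in groupby(lines, key=str.isspace)
        if not is_blank  # drop the blank separator runs
    ]

result = blocks(sample)
for block in result:
    # The first line of each run is the "Model N" header
    print(block[0].strip(), "->", len(block), "lines")
```

Each run starts with its "Model N" line, which is why the answer can take the file name from the first line of the group.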

Answer 1 (score: 0)

If your file fits in memory, you can split it with a regular expression and then iterate over the matches:

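import re

with open('models.txt') as handle:
    # DOTALL lets '.' span newlines; the non-greedy '.*?' stops at the first END
    models = re.findall(r"Model.*?END", handle.read(), re.DOTALL)

# Number the output files from 1 so they match "Model 1", "Model 2", ...
for i, model in enumerate(models, start=1):
    with open('model_%s.txt' % i, 'w') as output:
        output.write(model)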
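If the file does not fit in memory, a line-by-line variant may be a better fit. The following is only a sketch under some assumptions: every block starts with a line beginning with "Model" and ends with a line that is exactly "END", and the function name `split_models` and its `out_dir` parameter are hypothetical names chosen for illustration.

```python
import os

def split_models(path, out_dir="."):
    """Write each 'Model N' ... 'END' block in `path` to out_dir/model_N.txt.

    Reads one line at a time, so the whole file is never held in memory.
    Returns the list of output paths in the order they were written.
    """
    written = []
    out = None  # currently open output file, or None when between blocks
    try:
        with open(path) as src:
            for line in src:
                stripped = line.strip()
                if out is None:
                    # Assumption: a block starts at a line beginning with "Model"
                    if stripped.startswith("Model"):
                        name = stripped.lower().replace(" ", "_") + ".txt"
                        target = os.path.join(out_dir, name)
                        out = open(target, "w")
                        out.write(line)
                        written.append(target)
                else:
                    out.write(line)
                    # Assumption: a block ends at a line that is exactly "END"
                    if stripped == "END":
                        out.close()
                        out = None
    finally:
        # Close the last file if the input ended without a closing END
        if out is not None:
            out.close()
    return written
```

Calling `split_models('models.txt')` would write model_1.txt, model_2.txt and so on into the current directory. Unlike the groupby version, this keeps the "Model N" header line in each output file and does not depend on blank lines between blocks.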