Is there a way to make a Python script read a sectioned text file?

Time: 2018-04-09 02:35:43

Tags: python python-3.x

I want to read a text file that is formatted like this:

1      100
---stuff----
2      100
---stuff---
3      200
---stuff--

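Here 1 is the case ID and 100 is the number of lines that the "stuff" occupies. Is there a way to read the "1 100" section and the "2 100" section separately in Python?

2 answers:

Answer 0: (score: 0)

File structure (note that the fields on the numeric lines are tab-separated):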

1   3
abc
def
ghi
2   2
jkl
mno
3   4
pqr
stu
vwx
yz

Now try:

filename = 'data.txt'  # placeholder path; point this at your own file

with open(filename) as f:  # the with block closes the file when done
    all_lines = f.readlines()  # read all lines

content = []  # one list of body lines per segment

for line in all_lines:
    if len(line.split('\t')) == 2:  # header line: exactly two tab-separated fields
        content.append([])  # start a new segment
    elif content:  # body line: belongs to the most recent segment
        content[-1].append(line.strip())

for c in content:
    print(c)

Output:

['abc', 'def', 'ghi']
['jkl', 'mno']
['pqr', 'stu', 'vwx', 'yz']
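One caveat: the check above recognizes a header by counting tab-separated fields, so a body line that happens to contain exactly one tab would be mistaken for a header. Since the question says the second number is the line count of the section, a minimal sketch that relies on that count instead might look like the following. The function name `parse_sections` and the sample lines are hypothetical, and the sketch assumes each header holds exactly two whitespace-separated fields:

```python
def parse_sections(lines):
    """Map each case ID to its body lines, using the count given in the header."""
    sections = {}
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip a blank line where a header is expected
            continue
        case_id, count = lines[i].split()  # header: "<id> <number of lines>"
        count = int(count)
        # take the next `count` lines as the body of this section
        sections[case_id] = [ln.strip() for ln in lines[i + 1:i + 1 + count]]
        i += 1 + count  # jump to the next header
    return sections


# hypothetical sample data mirroring the file structure above
sample = ["1\t3", "abc", "def", "ghi", "2\t2", "jkl", "mno",
          "3\t4", "pqr", "stu", "vwx", "yz"]
sections = parse_sections(sample)
print(sections["2"])  # ['jkl', 'mno']
```

With a real file you could pass `f.readlines()` as `lines`. Keying the result by case ID makes it easy to fetch one section on its own, which is what the question asks for.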

Answer 1: (score: 0)

You can simply try a generator approach:

def generator_approach(f):
    sub_ = []
    for line in f:
        if 'stuff' in line:  # a 'stuff' line closes the current group
            yield sub_
            sub_ = []
        else:
            sub_.append(line.strip())
    if sub_:  # any lines left after the last 'stuff' line
        yield sub_

with open('file', 'r') as f:
    print(list(generator_approach(f)))

Output:

[['1      100'], ['2      100'], ['3      200']]
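Note that this generator splits on the literal text 'stuff', which is only a placeholder in the question, so with real data it would find nothing to split on. A hedged sketch of a lazy variant that reads the declared number of lines with `itertools.islice` is shown below. The name `iter_sections` and the demo data are hypothetical, and the sketch assumes every header is exactly "ID count" separated by whitespace:

```python
import io
from itertools import islice


def iter_sections(f):
    """Lazily yield (case_id, body_lines) pairs from an open text file."""
    for header in f:
        if not header.strip():
            continue  # skip blank lines between sections
        case_id, count = header.split()
        # islice pulls up to `count` further lines from the same file iterator
        body = [line.rstrip("\n") for line in islice(f, int(count))]
        yield case_id, body


# io.StringIO stands in for a real file in this demo
demo = io.StringIO("1 2\nfoo\nbar\n2 1\nbaz\n")
for case_id, body in iter_sections(demo):
    print(case_id, body)
```

Because the outer loop and `islice` share one iterator, only one section is held in memory at a time, which may matter for files with many long sections.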