Reading and grouping the contents of a text file in Python

Time: 2018-12-02 01:57:31

Tags: python

I have a text file that I want to read in Python.

Contents:

line1
line2

line3
line4
line5

line6

....

Reading:

with open(path, encoding="utf8", errors='ignore') as f1:
    contents = f1.readlines()
    print(contents)

Output:

[line1, line2,.... line6]

But I want to read the contents grouped by the blank lines that separate them.

Expected output:

[[line1, line2], [line3,line4,line5], [line6]]

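Is there a shorter way than reading the whole file into a list, iterating over it, and then grouping the lines by the blank lines between them? Any suggestions on the approach?

1 Answer:

Answer 0: (score: 3)

Something like this should do what you need: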

In [8]: result = []

In [9]: with open(path, encoding="utf8", errors='ignore') as fh:
   ...:     group = []
   ...:     for l in fh:
   ...:         l = l.strip()
   ...:         if not l:
   ...:             result.append(group)
   ...:             group = []
   ...:         else:
   ...:             group.append(l)
   ...:     if group:
   ...:         result.append(group)
   ...:

In [10]: result
Out[10]: [['line1', 'line2'], ['line3', 'line4', 'line5'], ['line6']]
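The loop above appends an empty list whenever the file starts with a blank line or contains several blank lines in a row. A minimal sketch of the same idea as a reusable helper that skips such empty groups is shown here; the name `group_lines` is hypothetical (it is not part of the answer), and the function takes any iterable of lines so it can be tried without a file.

```python
def group_lines(lines):
    """Split an iterable of lines into lists separated by blank lines."""
    result = []
    group = []
    for line in lines:
        line = line.strip()
        if line:
            group.append(line)
        elif group:
            # Blank line: close the current group, but never emit an empty one.
            result.append(group)
            group = []
    if group:
        # The input may not end with a blank line.
        result.append(group)
    return result


# Two consecutive blank lines between the first and second group.
sample = ["line1\n", "line2\n", "\n", "\n", "line3\n", "line4\n", "line5\n", "\n", "line6\n"]
print(group_lines(sample))
# [['line1', 'line2'], ['line3', 'line4', 'line5'], ['line6']]
```

Because an open file object is itself an iterable of lines, it can be passed in directly, for example `group_lines(fh)` inside the `with open(...) as fh:` block.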

Or another (less readable) one-liner version using itertools groupby:
from itertools import groupby
# Keep only the groups whose key is True, i.e. runs of non-blank lines.
[list(g) for k, g in groupby(open(path).read().splitlines(), key=lambda l: bool(l.strip())) if k]
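For readability, the groupby idea can be spread over a few lines. This is only a sketch: the name `group_lines_groupby` is hypothetical, and it assumes that lines should be stripped and that whitespace-only lines count as separators.

```python
from itertools import groupby


def group_lines_groupby(lines):
    """Group consecutive non-blank lines using itertools.groupby."""
    stripped = (line.strip() for line in lines)
    # groupby yields (key, run) pairs; the key is True for runs of non-blank lines.
    return [list(run) for has_text, run in groupby(stripped, key=bool) if has_text]


# The separator before "line6" contains only spaces.
sample = ["line1", "line2", "", "line3", "line4", "line5", "   ", "line6"]
print(group_lines_groupby(sample))
# [['line1', 'line2'], ['line3', 'line4', 'line5'], ['line6']]
```

Filtering on the groupby key rather than on the group contents means runs of blank lines are dropped no matter how many of them appear in a row.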