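I am processing a text file with an irregular structure: a header followed by several sections of data. What I want to do is iterate over a list and, once a certain character is encountered, jump to the next section. I made a simple example below. What is an elegant way to handle this?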
lines = ['a', 'b', 'c', '$', 1, 2, 3]
for line in lines:
    if line == '$':
        print("FOUND END OF HEADER")
        break
    else:
        print("Reading letters")
# Here, I start again, but I would like to continue from the current
# state of the iterator, in order to only read the remaining elements.
for line in lines:
    print("Reading numbers")
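Answer 0 (score: 3)
By using the built-in function iter to create an iterator over lines outside the for loops, you can share a single iterator between both loops. It is partially consumed by the first loop, and the second loop picks up where the first one stopped.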
lines = ['a', 'b', 'c', '$', 1, 2, 3]
iter_lines = iter(lines)  # This creates an iterator over lines
for line in iter_lines:
    if line == '$':
        print("FOUND END OF HEADER")
        break
    else:
        print("Reading letters")
for line in iter_lines:
    print("Reading numbers")
The code above prints the following:
Reading letters
Reading letters
Reading letters
FOUND END OF HEADER
Reading numbers
Reading numbers
Reading numbers
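The same shared-iterator idea can be wrapped in a small helper that collects the header and the remaining items into separate lists. This is only a sketch: the name split_at_marker and its marker parameter are made up for illustration and are not part of the original answer.

```python
def split_at_marker(items, marker='$'):
    """Split items into (header, body) around the first marker."""
    it = iter(items)  # one iterator shared by both steps below
    header = []
    for item in it:
        if item == marker:
            break  # the marker itself is consumed and dropped
        header.append(item)
    body = list(it)  # continues from wherever the loop stopped
    return header, body


header, body = split_at_marker(['a', 'b', 'c', '$', 1, 2, 3])
print(header)  # ['a', 'b', 'c']
print(body)    # [1, 2, 3]
```

If the marker never appears, the loop simply runs to the end, so everything lands in the header and the body is empty.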
Answer 1 (score: 1)
You can use enumerate to keep track of your position in the iteration:
lines = ['a', 'b', 'c', '$', 1, 2, 3]
for i, line in enumerate(lines):
    if line == '$':
        print("FOUND END OF HEADER")
        break
    else:
        print("Reading letters")
print(lines[i+1:])  # prints [1, 2, 3]
However, unless you really need to process the header section, @EdChum's idea of simply using index is probably better.
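For reference, the index-based approach mentioned here might look like the following sketch. list.index raises ValueError when the value is absent; treating that case as "everything is header" is an assumption made for this example, not something the original comment specifies.

```python
lines = ['a', 'b', 'c', '$', 1, 2, 3]

try:
    end_of_header = lines.index('$')  # position of the first '$'
except ValueError:
    # Assumption: with no '$' at all, treat the whole list as header.
    end_of_header = len(lines)

header = lines[:end_of_header]
body = lines[end_of_header + 1:]  # skip the '$' itself
print(header)  # ['a', 'b', 'c']
print(body)    # [1, 2, 3]
```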
Answer 2 (score: 0)
A simpler, and perhaps more Pythonic, way:
lines = ['a','b','c','$', 1, 2, 3]
print([i for i in lines[lines.index('$')+1:]])
# [1, 2, 3]
If you want to read each element after $ into a separate variable, try the following:
lines = ['a','b','c','$', 1, 2, 3]
a, b, c = [i for i in lines[lines.index('$')+1:]]
print(a, b, c)
# 1 2 3
Or, if you do not know how many elements follow $, you can do this:
lines = ['a','b','c','$', 1, 2, 3, 4, 5, 6]
a, *b = [i for i in lines[lines.index('$')+1:]]
print(a, *b)
# 1 2 3 4 5 6
Answer 3 (score: 0)
If you have more separators of this kind, the most general solution is to build a small state machine to parse your data:
def state0(line):
    pass  # processing function for state0
def state1(line):
    pass  # processing function for state1
# and so on...
states = (state0, state1, ...)  # tuple grouping all processing functions
separators = {'$':1, '#':2, ...}  # linking separators and states
state = 0  # initial state
for line in text:
    if line in separators:
        print('Found separator', line)
        state = separators[line]  # change state
    else:
        states[state](line)  # process line with associated function
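This solution correctly handles any number of separators, in any order and with any number of repetitions. The only limitation is that a given separator must always be followed by the same kind of data, which its associated function can process.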
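To make the template above concrete, here is a minimal runnable sketch. The three handler functions, the sample text, and the choice of '$' and '#' as the only separators are all assumptions made for illustration; a real parser would replace them with its own.

```python
header, numbers, notes = [], [], []

def read_header(line):
    header.append(line)  # state 0: keep header lines as-is

def read_numbers(line):
    numbers.append(int(line))  # state 1: lines after '$' are integers

def read_notes(line):
    notes.append(line)  # state 2: lines after '#' are free text

states = (read_header, read_numbers, read_notes)
separators = {'$': 1, '#': 2}  # separator -> index into states

# Hypothetical input: '$' appears twice to show that sections may repeat.
text = ['a', 'b', '$', '1', '2', '#', 'x', '$', '3']

state = 0  # start in the header state
for line in text:
    if line in separators:
        state = separators[line]  # switch to the section's handler
    else:
        states[state](line)  # dispatch to the current handler

print(header)   # ['a', 'b']
print(numbers)  # [1, 2, 3]
print(notes)    # ['x']
```

Because the second '$' switches back to read_numbers, the trailing '3' is appended to the same list as '1' and '2', which illustrates the "any order, any number of repetitions" property.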