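I need to write a script that reads four lines and, if a condition is met, reads the next four lines of the file, and so on. If the condition is not met, the script must restart the test from the second line of the block it just read, so the first line of what would have been the next block becomes the new fourth line. For example, I want to retrieve every block whose lines sum to 4 from the following file.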
printf "1\n1\n1\n1\n2\n1\n1\n1\n1" > file1.txt  # in Bash
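Lines 1 to 4 sum to 4, so they give a positive result. Lines 5 to 8 sum to 5, so they give a negative result, and the sum must be redone starting at line 6 and ending at line 9. That block sums to 4, so it gives a positive result. I know I can do something like this: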
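from itertools import islice

N = 4  # block size
with open("file1.txt") as infile:
    while True:
        # Read the next N lines; an empty list signals end of file
        lines = list(islice(infile, N))
        if not lines:
            break
        make_the_sum(lines)  # placeholder for the sum test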
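But this always advances the reader by four lines, and it cannot move back when the sum is greater than 4. How can I get this behaviour? Note that my files are large, so I cannot load them entirely into memory.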
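Answer 0 (score: 0)
I simplified things by ignoring the end-of-file problem. You can use tell and seek to return to an earlier position (you can keep as many positions as you like in a list), for example: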
>>> with open('testmedium.txt') as infile:
...     times = 0
...     EOF = 0  # never set: end-of-file handling is deliberately left out
...     while not EOF:
...         pos = infile.tell()  # remember where this block starts
...         print(f"\nPosition is {pos}")
...         lines = []
...         for i in range(4):
...             lines.append(infile.readline())
...         for l in lines:
...             print(l[:20])
...         if times==0 and '902' in lines[0]:
...             times = 1
...             infile.seek(pos)  # rewind to the start of this block
...         elif '902' in lines[0]:
...             break
...

Position is 0
271,848,690,44,511,5
132,427,793,452,85,6
62,617,183,843,456,3
668,694,659,691,242,

Position is 125
902,550,177,290,828,
326,603,623,79,803,5
803,949,551,947,71,8
661,881,124,382,126,

Position is 125
902,550,177,290,828,
326,603,623,79,803,5
803,949,551,947,71,8
661,881,124,382,126,
>>>
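The demo above only shows that seek can rewind to a saved position. Applied to the question's sum test, the same idea might look like the sketch below. The function name `find_blocks` and its parameters `size` and `target` are made up for illustration, and it assumes every line holds a single integer.

```python
def find_blocks(path, size=4, target=4):
    """Return the 1-based starting line numbers of blocks that sum to target."""
    hits = []
    start = 1  # line number of the first line of the current block
    with open(path) as infile:
        while True:
            first = infile.readline()
            if not first:  # readline() returns "" at end of file
                break
            after_first = infile.tell()  # position where line 2 of this block begins
            block = [first] + [infile.readline() for _ in range(size - 1)]
            if "" in block:  # fewer than `size` lines were left
                break
            if sum(int(line) for line in block) == target:
                hits.append(start)
                start += size  # continue with the next full block
            else:
                infile.seek(after_first)  # retry from the block's second line
                start += 1
    return hits
```

With the `file1.txt` from the question, `find_blocks("file1.txt")` should return `[1, 6]`. Only one block is held in memory at a time, at the cost of re-reading up to three lines after each failed test.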
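Answer 1 (score: 0)
The following code reads lines into a "cache" (just a list) and runs a test on the cached lines once the cache holds four of them. If the test passes, the cache is cleared. If the test fails, the cache is updated to contain only its last three lines. You can do any other work you need inside the if-else block.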
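def passes_test(lines, target_value=4):
    # int() tolerates the trailing newline on each line
    return sum(int(line) for line in lines) == target_value

with open('file1.txt') as f:
    cached = []
    for line in f:
        cached.append(line)
        if len(cached) == 4:
            if passes_test(cached):
                # Block matched: start a fresh block
                cached = []
            else:
                # No match: drop the first line, keep the last three
                cached = cached[1:]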
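If you also want to know where each matching block starts, the same caching idea can be wrapped in a generator. This is only a sketch: `matching_blocks` and its parameters are hypothetical names, it swaps the list for a `collections.deque`, and it assumes one integer per line.

```python
from collections import deque


def matching_blocks(lines, size=4, target=4):
    """Yield (start_line, values) for each block of `size` values summing to `target`."""
    window = deque()
    for lineno, line in enumerate(lines, start=1):
        window.append(int(line))
        if len(window) == size:
            if sum(window) == target:
                # Block matched: report it and start a fresh block
                yield lineno - size + 1, list(window)
                window.clear()
            else:
                # No match: drop the oldest value and keep sliding
                window.popleft()
```

Any iterable of lines works, so an open file can be passed directly, for example `for start, values in matching_blocks(f): ...`. At most `size` values are held at once, so the file is never loaded whole.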
Answer 2 (score: 0)
As Martijn said:
with open("file1.txt") as f:
    rd = lambda: int(next(f))  # read the next line as an integer
    try:
        a, b, c, d = rd(), rd(), rd(), rd()
        while True:
            if a + b + c + d == 4:
                # found a block
                a, b, c, d = rd(), rd(), rd(), rd()
            else:
                # nope: shift by one line and read one more
                a, b, c, d = b, c, d, rd()
    except StopIteration:
        # found end of file
        pass