I have a file containing data like this:
# 0 867.691994 855.172889 279.230411 -78.951239 55.994189 -164.824148
# 0 872.477810 854.828159 279.690170 -78.950558 55.994391 -164.823700
...
893.270609 1092.179289 184.692319
907.682255 1048.809187 112.538457
...
# 0 877.347791 854.481104 280.214892 -78.949869 55.994596 -164.823240
...
893.243290 1091.395104 184.726720
907.682255 1048.809187 112.538457
...
# 0 882.216053 854.135168 280.745489 -78.948443 55.996206 -164.821887
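I want to read only the lines between the comment lines, in the following way: read all lines between two adjacent comments into an array (without saving them to a file), work with that array, then read the next block into the array, and so on.
I managed to get it to read one block: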
def main():
    sourceFile = 'test.asc'
    print 'Extracting points ...'
    extF = open(sourceFile, 'r')
    block, cursPos = readBlock(extF)
    extF.close()
    print 'Finished extraction'

def readBlock(extF):
    countPnts = 0
    extBlock = []
    line = extF.readline()
    while not line.startswith('#'):
        extPnt = Point(*[float(j) for j in line.split()])
        countPnts += 1
        extBlock.append(extPnt)
        line = extF.readline()
    cursPos = extF.tell()
    print 'Points:', countPnts
    print 'Cursor position:', cursPos
    return extBlock, cursPos
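It works perfectly, but only for one block of data. I can't get it to iterate from one block to the next across the comment lines. I was thinking about using the cursor position, but couldn't make that work. Please give me some hints on this. Thanks.
Update: I implemented MattH's idea as follows: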
def blocks(seq):
    buff = []
    for line in seq:
        if line.startswith('#'):
            if buff:
                #yield "".join(buff)
                buff = []
        else:
            # I need to make those numbers float
            line_spl = line.split()
            pnt = [float(line_spl[k]) for k in range(len(line_spl))]
            #print pnt
            buff.append(Point(*pnt))
    if buff:
        yield "".join(buff)
Then, if I run it:
for block in blocks(extF.readlines()):
    print 'p'
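All I get is an empty window, even though print 'p' is inside the for loop.
So, a few questions:
What does
    if buff:
        yield "".join(buff)
do? When I comment it out, nothing changes...
Why don't the commands inside the for loop run?
This function is a generator, so I can't access lines that were processed earlier, can I?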
Solution
I managed to do it myself using the ideas from MattH and Ashwini Chaudhari. In the end, I got this:
def readBlock(extF):
    countPnts = 0
    extBlock = []
    line = extF.readline()
    if line.startswith('#'):
        line = extF.readline()
    else:
        while not line.startswith('#'):
            extPnt = Point(*[float(j) for j in line.split()])
            countPnts += 1
            extBlock.append(extPnt)
            line = extF.readline()
    return extBlock, countPnts
Running it:
while extF.readline():
    block, pntNum = readBlock(extF)
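It does exactly what I need.
Thanks, everyone.
Answer 0 (score: 2)
Here are two simple generators: one yields all non-comment blocks, the other yields only the non-comment blocks that lie between comments. Updated to show both possibilities, and updated again so that splitting and joining happen in the same function for consistency.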
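A note on the loop above: `while extF.readline():` reads and discards one line on every pass, and at end of file `readline()` returns an empty string, which does not start with '#'. The following is only a minimal sketch of a variant that checks for end of file explicitly. `Point` is a hypothetical namedtuple standing in for the class used in the question, `read_block` is an illustrative name, and the in-memory file contents are made up for the demo.

```python
import io
from collections import namedtuple

# Hypothetical stand-in for the Point class used in the question.
Point = namedtuple("Point", "x y z")

def read_block(f):
    """Read points up to the next comment line or the end of the file.

    Returns (points, more); more is False once the file is exhausted.
    """
    points = []
    while True:
        line = f.readline()
        if not line:                  # '' means end of file
            return points, False
        if line.startswith("#"):      # a comment line closes the block
            return points, True
        if line.strip():              # ignore blank lines
            points.append(Point(*[float(v) for v in line.split()]))

# Made-up data in the same shape as the file in the question.
demo = io.StringIO(
    "# 0 1.0 2.0\n"
    "# 0 3.0 4.0\n"
    "893.27 1092.17 184.69\n"
    "907.68 1048.80 112.53\n"
    "# 0 5.0 6.0\n"
    "893.24 1091.39 184.72\n"
    "# 0 7.0 8.0\n"
)

blocks = []
more = True
while more:
    block, more = read_block(demo)
    if block:                         # consecutive comments give empty blocks
        blocks.append(block)
```

Unlike the "between comments only" behaviour asked for, this sketch would also return rows that appear before the first comment or after the last one, so those would need to be filtered out if they can occur in the file.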
sample = """Don't yield this
# 0 867.691994 855.172889 279.230411 -78.951239 55.994189 -164.824148
# 0 872.477810 854.828159 279.690170 -78.950558 55.994391 -164.823700
...
893.270609 1092.179289 184.692319
907.682255 1048.809187 112.538457
...
# 0 877.347791 854.481104 280.214892 -78.949869 55.994596 -164.823240
...
893.243290 1091.395104 184.726720
907.682255 1048.809187 112.538457
...
# 0 882.216053 854.135168 280.745489 -78.948443 55.996206 -164.821887
Don't yield this either"""
def blocks1(text):
    """All non-comment blocks"""
    buff = []
    for line in text.split('\n'):
        if line.startswith('#'):
            if buff:
                yield "\n".join(buff)
                buff = []
        else:
            buff.append(line)
    if buff:
        yield "\n".join(buff)

def blocks2(text):
    """Only non-comment blocks *between* comments"""
    buff = None
    for line in text.split('\n'):
        if line.startswith('#'):
            if buff is None:
                buff = []
            if buff:
                yield "\n".join(buff)
                buff = []
        else:
            if buff is not None:
                buff.append(line)

for block in blocks2(sample):
    print "Block:\n%s" % (block,)
Yields:
Block:
...
893.270609 1092.179289 184.692319
907.682255 1048.809187 112.538457
...
Block:
...
893.243290 1091.395104 184.726720
907.682255 1048.809187 112.538457
...
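The same pattern can be applied directly to an open file and to numeric rows. Note that "".join(buff) only works when buff holds strings, so a block of parsed points is better yielded as a list. Also, a generator that has only a trailing yield produces nothing when the input ends with a comment line, because the buffer is empty by then. What follows is a minimal sketch, assuming every data row consists of whitespace-separated floats; `point_blocks` and the demo data are hypothetical and made up for illustration.

```python
import io

def point_blocks(lines):
    """Yield lists of float tuples found between two comment lines."""
    buff = None                       # None until the first comment is seen
    for line in lines:
        if line.startswith("#"):
            if buff:                  # non-empty block closed by this comment
                yield buff
            buff = []
        elif buff is not None and line.strip():
            buff.append(tuple(float(v) for v in line.split()))
    # Rows after the last comment are dropped on purpose.

# Made-up data: one row before the first comment, one after the last.
demo = io.StringIO(
    "1.0 2.0 3.0\n"
    "# comment\n"
    "# comment\n"
    "4.0 5.0 6.0\n"
    "7.0 8.0 9.0\n"
    "# comment\n"
    "10.0 11.0 12.0\n"
    "# comment\n"
    "13.0 14.0 15.0\n"
)

result = list(point_blocks(demo))
```

Because the function accepts any iterable of lines, a file object can be passed as is, without calling readlines() first, and only one block is held in memory at a time.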
Answer 1 (score: 0)
data.txt:
123456
1234
# 0 867.691994 855.172889 279.230411 -78.951239 55.994189 -164.824148
# 0 872.477810 854.828159 279.690170 -78.950558 55.994391 -164.823700
...
893.270609 1092.179289 184.692319
907.682255 1048.809187 112.538457
...
# 0 877.347791 854.481104 280.214892 -78.949869 55.994596 -164.823240
...
893.243290 1091.395104 184.726720
907.682255 1048.809187 112.538457
...
# 0 882.216053 854.135168 280.745489 -78.948443 55.996206 -164.821887
1234
12345
Program:
with open('data.txt') as f:
    lines=[x.strip() for x in f if x.strip()]
    for i,x in enumerate(lines): #loop to find the first comment line
        if x.startswith('#'):
            ind=i
            break
    for i,x in enumerate(lines[::-1]): #loop to find the first comment line from the end
        if x.startswith('#'):
            ind1=i
            break
    for x in lines[ind+1:-ind1-1]:
        if not x.startswith('#'):
            print x
Output:
...
893.270609 1092.179289 184.692319
907.682255 1048.809187 112.538457
...
...
893.243290 1091.395104 184.726720
907.682255 1048.809187 112.538457
...
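The program above prints all rows as one flat stream, so the boundaries between blocks are lost. If the grouping matters, one possible alternative is itertools.groupby, which splits the lines into runs of comment and non-comment lines. This is only a sketch: the function name and the sample lines are made up for illustration, and it assumes the whole input fits in memory.

```python
from itertools import groupby

def blocks_between_comments(lines):
    """Return the non-comment blocks that are enclosed by comment lines."""
    cleaned = [x.strip() for x in lines if x.strip()]
    # Consecutive lines with the same "is a comment" flag form one group.
    groups = [
        (is_comment, list(group))
        for is_comment, group in groupby(cleaned, key=lambda x: x.startswith("#"))
    ]
    # Drop a leading or trailing non-comment group: it is not enclosed.
    if groups and not groups[0][0]:
        groups = groups[1:]
    if groups and not groups[-1][0]:
        groups = groups[:-1]
    return [group for is_comment, group in groups if not is_comment]

# Made-up lines in the same shape as data.txt.
sample_lines = [
    "123456", "1234",
    "# a", "# b",
    "...", "893.27 1092.17 184.69", "...",
    "# c",
    "...", "893.24 1091.39 184.72",
    "# d",
    "1234", "12345",
]

result = blocks_between_comments(sample_lines)
```

Each element of the result is one block, so the rows of a block can be processed together before moving on to the next.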