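For the working code below, I create a class instance to store the name of my output file [a_string] and the file object itself [f_object]. I found that variables assigned in the first if statement did not appear to be in scope in the elif statement that follows it.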
#Text file splitter: data between the '*' lines is copied into new files.
class Output_file():
    def __init__(self,a_string='none',f_object='none'):
        self.name=a_string
        self.foutput=f_object
outputfile=Output_file()
n=0
filehandle=open('original file.txt')
for line in filehandle:
    if line[0]=='*': #find the '*' that splits the rows of data
        n+=1
        outputfile.name = 'original file_'+str(n)+'.txt'
        outputfile.foutput= open(outputfile.name,'w')
        outputfile.foutput.write(line)
    elif len(line.split()) ==5 and n > 0: #make sure the bulk data occurs in blocks of 5
        outputfile.foutput= open(outputfile.name,'a+')
        outputfile.foutput.write(line)
        outputfile.foutput.close()
Do I have to use a class instance to store the file name and file object, or is there a better way?
Answer 0 (score: 3)
A variable assigned in an if or elif branch should be visible in the other branch. For example:
>>> for i in range(5):
...     if i%2==0:
...         x = i
...         print(x)
...     else:
...         print(x)
...
0
0
2
2
4
This would not happen in a block-scoped language, but (unfortunately) Python is not block-scoped, so this should work.
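Note that for this to work, your name= assignment must have executed before you try to use the name. That is, your if branch must run at least once before the elif branch does.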
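To make the ordering requirement concrete, here is a minimal sketch. The function name last_even_seen and its inputs are hypothetical, made up for illustration only. It works when the if branch runs first and fails when the else branch runs first:

```python
def last_even_seen(values):
    """For each value, record the most recent even value seen so far."""
    seen = []
    for v in values:
        if v % 2 == 0:
            latest = v           # 'latest' is assigned only in this branch
            seen.append(latest)
        else:
            seen.append(latest)  # reads the name assigned in the if branch
    return seen

# The even value comes first, so 'latest' exists by the time else runs.
print(last_even_seen([0, 1, 2, 3]))

try:
    last_even_seen([1, 2])       # odd value first: 'latest' is not assigned yet
except NameError as exc:         # UnboundLocalError is a subclass of NameError
    print('failed:', type(exc).__name__)
```

Inside a function the failure shows up as UnboundLocalError; at module level the same mistake raises a plain NameError.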
Your code is light on comments, but I assume your data looks something like this:
This is a header
blah blah blah
**************
16 624 24 57 32
352 73 47 76 3
25 6 78 80 21 331
**************
234 234 4 64 7
**************
**************
86 57 2 5 14
4 8 3 634 7
and that you want to split it into separate files, but only the "valid" data. If I wanted to imitate your style, I would code it like this:
def isSeparatorLine(line):
    return line[0] == '*'

def isValidLine(line):
    return len(line.split())==5

groupNum = 0
outputFile = None
with open('original file.txt') as original:
    for line in original:
        if isSeparatorLine(line):
            groupNum += 1
            outputFilename = 'original file_{}.txt'.format(groupNum)
            if outputFile:
                outputFile.close()  # finish the previous group's file
            outputFile = open(outputFilename, 'w')
            outputFile.write('New file with group {}\n'.format(groupNum))
        elif groupNum>0 and isValidLine(line):
            outputFile.write(line)
if outputFile:
    outputFile.close()  # close the last group's file
But personally I would prefer to write it like this:
from itertools import groupby

FILENAME = 'original file.txt'
FILENAME_TEMPLATE = 'stanza-{}.txt'

def isSeparatorLine(line):
    # A separator is a non-empty line made up only of '*' characters.
    return line != '' and all(c=='*' for c in line)

def isValidLine(line):
    return len(line.split())==5

def extractStanzas(text):
    """
    Yields: [stanza0:line0,line1,...], [stanza1:lineN,lineN+1,...], [stanza2:...]
    where each stanza is separated by a separator line, as defined above
    """
    for isSeparator,stanza in groupby(text.splitlines(), isSeparatorLine):
        if not isSeparator:
            # Copy the group into a list: groupby invalidates it once iteration advances.
            yield list(stanza)

with open(FILENAME) as f:
    stanzas = list(extractStanzas(f.read()))

# stanzas[0] is assumed to be the header before the first separator, so skip it.
for i,stanza in enumerate(stanzas[1:]):
    assert all(isValidLine(line) for line in stanza), 'invalid line somewhere: {}'.format(stanza)
    with open(FILENAME_TEMPLATE.format(i), 'w') as output:
        output.write('\n'.join(stanza))
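If you want to try the groupby idea without touching any files, here is a small in-memory sketch. The names is_separator, split_stanzas and sample are hypothetical, and the sample text is invented, not taken from your data. It also assumes a separator is a non-empty line made only of '*' characters:

```python
from itertools import groupby

def is_separator(line):
    # Assumption: a separator is a non-empty line made only of '*' characters.
    return line != '' and all(c == '*' for c in line)

def split_stanzas(text):
    """Return a list of stanzas, each a list of the lines between separator runs."""
    stanzas = []
    for is_sep, group in groupby(text.splitlines(), is_separator):
        if not is_sep:
            # Copy the group right away: groupby invalidates it once iteration advances.
            stanzas.append(list(group))
    return stanzas

# Invented sample: a header, one separator, two data lines,
# two consecutive separators, then one more data line.
sample = "header\n*****\n1 2 3 4 5\n6 7 8 9 10\n*****\n*****\n11 12 13 14 15\n"

for number, stanza in enumerate(split_stanzas(sample)):
    print(number, stanza)
```

Consecutive separator lines fall into a single group, so they do not produce empty stanzas. Whether stanza 0 (the header) should be skipped depends on whether your file really starts with one.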