Question

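In the working code below, I create a class instance to store the output file's name [a_string] and the file object itself [f_object]. I did this because I found that variables assigned in the first if statement were not in scope in the elif statement that follows it.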

#Text file splitter, data between the '*' lines are copied into new files.

class Output_file():
    def __init__(self,a_string='none',f_object='none'):
        self.name=a_string
        self.foutput=f_object

outputfile=Output_file()

n=0
filehandle=open('original file.txt')
for line in filehandle:

    if line[0]=='*': #find the '*' that splits the rows of data
        n+=1
        outputfile.name = 'original file_'+str(n)+'.txt'
        outputfile.foutput= open(outputfile.name,'w')
        outputfile.foutput.write(line)

    elif len(line.split()) ==5 and n > 0: #make sure the bulk data occurs in blocks of 5
        outputfile.foutput= open(outputfile.name,'a+')
        outputfile.foutput.write(line)


outputfile.foutput.close()

Do I have to use a class instance to store the file name and file object, or is there a better way?

Answer 1

A variable assigned in an if or elif branch should be visible in the other branch. For example:

>>> for i in range(5):
...  if i%2==0:
...   x = i
...   print(x)
...  else:
...   print(x)
... 
0
0
2
2
4

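This would not work in a block-scoped language, but Python is not block-scoped: if, elif, and for bodies do not create a new scope, so your original approach should work without a class.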

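Note that for this to work, the assignment (your name=) must execute before anything tries to use the variable. In other words, the if branch must run at least once before the elif branch does.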
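
A minimal sketch of that caveat follows. The function name `last_even_seen` and its inputs are made up for illustration; the only point is that a name assigned in one branch does not exist until that branch has actually run.

```python
def last_even_seen(numbers):
    """For each item, record the most recent even number seen so far."""
    seen = []
    for i in numbers:
        if i % 2 == 0:
            x = i            # x is bound for the first time here
            seen.append(x)
        else:
            seen.append(x)   # raises UnboundLocalError if no even number came first
    return seen


print(last_even_seen([0, 1, 2, 3, 4]))  # the if branch runs first, so x is always bound

try:
    last_even_seen([1, 2])              # the else branch runs first
except UnboundLocalError as exc:
    print('x was used before assignment:', exc)
```

At module level (outside a function) the same mistake shows up as a plain NameError instead. Initialising the variable before the loop, for example to None, avoids both.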

Your code is short on comments, but I assume your data looks something like this:

 This is a header
 blah blah blah
 **************
 16 624 24 57 32
 352 73 47 76 3
 25 6 78 80 21 331
 **************
 234 234 4 64 7
 **************
 **************
 86 57 2 5 14
 4 8 3 634 7

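and that you want to split it into separate files, keeping only the "valid" data lines. If I wanted to imitate your style, I would write it like this: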

def isSeparatorLine(line):
    # A separator line starts with '*'
    return line.startswith('*')
def isValidLine(line):
    # Bulk data comes in rows of exactly 5 whitespace-separated fields
    return len(line.split())==5

groupNum = 0
outputFile = None
with open('original file.txt') as original:
    for line in original:
        if isSeparatorLine(line):
            groupNum += 1
            outputFilename = 'original file_{}.txt'.format(groupNum)
            if outputFile:
                outputFile.close()  # finish the previous group's file first
            outputFile = open(outputFilename, 'w')
            outputFile.write('New file with group {}\n'.format(groupNum))
        elif groupNum>0 and isValidLine(line):
            outputFile.write(line)
if outputFile:
    outputFile.close()  # close the last group's file

But personally I would prefer to write it like this:

from itertools import groupby

FILENAME = 'original file.txt'
FILENAME_TEMPLATE = 'stanza-{}.txt'

def isSeparatorLine(line):
    # Non-empty line made up only of '*' (all() alone is True for an empty line)
    return bool(line) and all(c=='*' for c in line)
def isValidLine(line):
    return len(line.split())==5
def extractStanzas(text):
    """
        Yields: [stanza0:line0,line1,...], [stanza1:lineN,lineN+1,...], [stanza2:...]
        where each stanza is separated by a separator line, as defined above
    """
    for isSeparator,stanza in groupby(text.splitlines(), isSeparatorLine):
        if not isSeparator:
            # Copy into a list: groupby's group iterator is invalidated
            # as soon as groupby moves on to the next group
            yield list(stanza)

with open(FILENAME) as f:
    stanzas = list(extractStanzas(f.read()))

# stanzas[0] is skipped: it is assumed to be the header before the first separator
for i,stanza in enumerate(stanzas[1:]):
    assert all(isValidLine(line) for line in stanza), 'invalid line somewhere: {}'.format(stanza)
    with open(FILENAME_TEMPLATE.format(i), 'w') as output:
        output.write('\n'.join(stanza))
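
One detail worth knowing when grouping lines with `itertools.groupby`: each group it hands back is a lazy iterator that shares the underlying source, and the documentation notes that a previous group is no longer visible once `groupby` advances. The sketch below illustrates this with a made-up list of lines and a hypothetical helper `is_separator`; it is not taken from the question's data.

```python
from itertools import groupby

# Made-up sample lines, for illustration only
lines = ['header', '***', '1 2 3 4 5', '6 7 8 9 10', '***', '11 12 13 14 15']

def is_separator(line):
    # Hypothetical helper: a non-empty line consisting only of '*'
    return line != '' and all(c == '*' for c in line)

# Keeping the raw group iterators: groupby has already moved past them
# by the time they are read, so their contents are lost.
lazy = [group for is_sep, group in groupby(lines, is_separator) if not is_sep]
lazy_contents = [list(group) for group in lazy]

# Copying each group into a list while it is still the current group
eager = [list(group) for is_sep, group in groupby(lines, is_separator) if not is_sep]

print(lazy_contents)
print(eager)
```

So if the groups need to outlive the loop that produces them, copying each one with `list(...)` at the moment it is produced is the safe choice.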
