Question

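I have been googling this problem since yesterday, to no avail.

When I loop over multiple files in a directory and process each file's lines inside that loop, I always close the file, yet it seems as if Python opens all the files in the same memory space: as I loop through the files, I retrieve all the records from the previously opened files, as if they were in an array of pointers... wtf?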

    import os
    import sys
    import glob
    import string
    import cPickle
    path2 = './'
    columnShuffleTable = loadColumnTable('myTable') #func previously defined
    codeScrambleTable = loadScrambleTable('theirTable') #func previously defined
    pathToFiles2 = glob.glob(os.path.join(path2, '*.DAT'))

    for curFile in pathToFiles2:    
        _list = ['',] 
        #this is the variable with which I'm having a problem
        unscrambledCodes = file(curFile[-10:], 'r') 
        #this always yields the actual first line of the file at which I am currently at
        line = unscrambledCodes.readline() 
        _list[0] = '|' + line.strip() #stripping trailing spaces
        #the list length at this point always equates to '1', so up to here everything is great
        print "list length:", len(_list) 
        # this always reads the 2nd line of the very first file I loaded. . .wtf?
        line = unscrambledCodes.readline().strip() 

        while(line):
            #for unscrambledCodes [my input file] 
            print "len list: ", len(_list), "infile", unscrambledCodes 
            nextLine = unscrambledCodes.readline().strip()

            if not nextLine:
                _list.append('|' + line)
                break
            else:
                _list.append( '|' + line[:-14] + scrambleCode(line[-12:], columnShuffleTable, codeScrambleTable))
            #end if

            line = nextLine
        unscrambledCodes.close()
        outfile = open(curFile[-10:-4] + '.Scrambled', 'w')
        output = '\n'.join(_list)
        outfile.write(output)
        outfile.close()

As requested, here is a sample of my I/O:

Input file 1:
AB00007737106517 COSTCLASSU275
C000000010031932155750539976333693187714
C000000010031932155750539976105307608239

Input file 2:
AB00007736638744 COSTCLASSU275
C000000010030284907699012480608351468369
C000000020030284907699012480751885101503

Input file 3:
AB00007737148207 COSTCLASSU275
C000000010032271716759259098738354718484
C000000020032271716759259098394986919513

Desired output file 1:
AB00007737148207 COSTCLASSU275
| C000000010031932155750539976079292077121
| C000000010031932155750539976126217711213

File 2:
AB00007736638744 COSTCLASSU275
| C000000010030284907699012480968864628712
| C000000020030284907699012480294550195814

File 3:
AB00007737106517 COSTCLASSU275
| C000000010032271716759259098216262704445
| C000000020032271716759259098085462231948

Current output file 1:
AB00007737148207 COSTCLASSU275
| C000000010031932155750539976079292077121
| C000000010031932155750539976126217711213

File 2:
AB00007736638744 COSTCLASSU275
| C000000010031932155750539976079292077121
| C000000010031932155750539976126217711213
.
.
.
| C000000010030284907699012480968864628712
| C000000020030284907699012480294550195814
File 3:
AB00007737106517 COSTCLASSU275
| C000000010031932155750539976079292077121
| C000000010031932155750539976126217711213
.
.
.
| C000000010030284907699012480968864628712
| C000000020030284907699012480294550195814
.
.
.
| C000000010032271716759259098216262704445
| C000000020032271716759259098085462231948

Answer 1

Yes, unscrambledCodes.readline() reads the file one line at a time, advancing to the next line on each call, until the whole file has been read.
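
To illustrate that behaviour, here is a minimal sketch. It assumes a small throwaway file created just for the demonstration (the file name and contents are made up), and it is written so that it should run on Python 3 as well as 2.7. Each call to readline() returns the next line and moves the file position forward; at end of file it returns an empty string.

```python
import os
import tempfile

# Hypothetical sample data, written to a temporary file for the demo.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'sample.DAT')
with open(sample_path, 'w') as handle:
    handle.write('header\nfirst\nsecond\n')

lines_seen = []
with open(sample_path, 'r') as handle:
    line = handle.readline()
    while line:                        # an empty string signals end of file
        lines_seen.append(line.strip())
        line = handle.readline()       # advances to the next line

print(lines_seen)
```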

You can use the following:

content = unscrambledCodes.readlines()

to read every line into a list. You can then iterate over content and update the lines as needed.

Also, I usually use open() rather than file():

myFile = open('filename.txt','r')
content = myFile.readlines()
myFile.close()
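
Applied to the loop in the question, that approach might look roughly like the sketch below. It assumes the layout the question's code implies (a header line, then code lines, with the last line copied without scrambling). The names `process_lines` and `fake_scramble` are hypothetical: `fake_scramble` is only a stand-in for the asker's scrambleCode, and the sample lines are made up.

```python
def process_lines(lines, scramble):
    """Roughly mirror the question's loop on the lines of one file.

    `scramble` is a placeholder for the asker's scrambleCode call.
    """
    stripped = []
    for line in lines:
        line = line.strip()
        if not line:                   # stop at the first blank line
            break
        stripped.append(line)
    if not stripped:
        return []

    result = ['|' + stripped[0]]       # header line
    body = stripped[1:]
    for index, line in enumerate(body):
        if index == len(body) - 1:
            result.append('|' + line)  # last line is copied unchanged
        else:
            result.append('|' + line[:-14] + scramble(line[-12:]))
    return result


def fake_scramble(code):
    # Stand-in only: reverses the 12-character code.
    return code[::-1]


# In real use, `content` would come from myFile.readlines().
content = ['HEADER\n', 'PREFIXxxABCDEFGHIJKL\n', 'LASTLINE\n']
result = process_lines(content, fake_scramble)
print(result)
```

Because the result list is created inside the function, nothing from one file can carry over into the next call.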

Answer 2

The general consensus is to use open rather than file. I'd start with that.

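Second, try a list comprehension over the opened file, because it is easier (the next method returns each line with its trailing newline, hence the strip):

new_file=[x.strip() for x in unscrambledCodes]

and then do whatever else you have to do, such as

new_file=["|"+line for line in new_file[:-1]]

and

new_file[-1]=......

As others above have pointed out, you may also want to try the with keyword (even though it adds another level of indentation), like

with open("....","r") as in_file, open("...","w") as out_file: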
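
As a sketch of how the with form could cover the whole loop: the function name `convert_directory`, the `transform` argument, and the demo files are all assumptions made for illustration, and `str.upper` merely stands in for the real per-line change. It opens each file by the path that glob returns and uses os.path.splitext for the output name, so it does not rely on file names having a fixed length. Unlike the question's code, it writes each output next to its input file.

```python
import glob
import os
import tempfile


def convert_directory(directory, transform):
    """Write a '.Scrambled' copy of every '.DAT' file in `directory`.

    `transform` is a placeholder for whatever per-line change is needed.
    Returns the list of output paths.
    """
    written = []
    for in_path in sorted(glob.glob(os.path.join(directory, '*.DAT'))):
        out_path = os.path.splitext(in_path)[0] + '.Scrambled'
        # Both files are closed automatically when the block ends.
        with open(in_path, 'r') as in_file, open(out_path, 'w') as out_file:
            new_file = [x.strip() for x in in_file]   # fresh list per file
            new_file = ['|' + transform(line) for line in new_file]
            out_file.write('\n'.join(new_file))
        written.append(out_path)
    return written


# Demo with made-up files in a temporary directory.
demo_dir = tempfile.mkdtemp()
for name, text in [('a.DAT', 'one\ntwo\n'), ('b.DAT', 'three\n')]:
    with open(os.path.join(demo_dir, name), 'w') as handle:
        handle.write(text)

outputs = convert_directory(demo_dir, str.upper)
with open(outputs[0]) as handle:
    first_output = handle.read()
with open(outputs[1]) as handle:
    second_output = handle.read()
print(second_output)
```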

Error reading files while looping over the files in a directory in Python 2.7
