How do I use os.walk to access the contents of files in subdirectories?

Time: 2014-01-28 16:01:33

Tags: python

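I want to access and process the contents of files in subdirectories, starting from a root directory. I have tried os.walk, which lets me find the files, but how do I access their contents? The files I want all have the same name, one in each subdirectory, but those subdirectories also contain other files. This is my attempt: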

import os
import numpy as np
for root, dirs, files in os.walk("/rootDir/"):
    for file in files:
        if file.endswith(('sum.txt')):
            print file  #Here, the desired file name is printed
            PIs = []
            for line in file:
                print line #Here, I only get 's' printed, which I believe is the first letter in 'sum.txt'
                line = line.rstrip()
                line = line.split('\t')
                PIs.append(line[2])
    print PIs #nothing is collected so nothing is printed

How do I loop over the contents of the desired files in these subdirectories of the root directory?

Follow-up question:

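I got an answer to my first question, and now I have another one. The directories under the root directory contain many subdirectories. From each directory, I want to access information in only the one subdirectory that has the same name everywhere. This is what I tried: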

for root, dirs, files in os.walk("/rootPath/"):
  for dname in dirs:
    #print dname, type(dname)
    allPIs = []
    allDirs = []
    if dname.endswith('code_output'):  #I only want to access information from one file in sub-directories with this name
      ofh = open("sumPIs.txt", 'w')
      ofh.write("path\tPIs_mean\n")
      for fname in files: #Here i want to be in the code_output sub-directory
        print fname #here I only want to see files in the sub-directory with the 'code_output' end of a name, but I get all files in the directory AND sub-directory
        if fname.endswith('sumAll.txt'):
          PIs = []
          with open(os.path.join(root,fname), 'r') as fh_in:
            for line in fh_in:
              line = line.rstrip()
              line = line.split('\t')
              PIs.append(int(line[2]))
          PIs_mean = numpy.mean(PIs)
          allPIs.append(PIs_mean)
          allDirs.append(filePath)

Why does this loop over all the files in the directory rather than only those in the subdirectories whose names end with 'code_output'?
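A likely explanation, sketched here rather than taken from the answers: on each iteration of os.walk, files lists the files that sit directly in root, not the files inside the entries of dirs. Looping over dirs therefore only yields names, and the inner loop over files still sees the files of the current root. One way around this is to test the name of the directory currently being visited. The function name and its default suffixes below are assumptions chosen to match the question.

```python
import os


def files_in_named_dirs(root_dir, dir_suffix="code_output", file_suffix="sumAll.txt"):
    """Return paths of matching files that sit directly inside matching directories."""
    matches = []
    for root, dirs, files in os.walk(root_dir):
        # `files` belongs to `root`, not to the entries of `dirs`,
        # so check the name of the directory being visited right now.
        current_name = os.path.basename(os.path.normpath(root))
        if not current_name.endswith(dir_suffix):
            continue
        for fname in files:
            if fname.endswith(file_suffix):
                matches.append(os.path.join(root, fname))
    return matches
```

Each returned path can then be opened and parsed in the same way as in the question, for example to compute a mean per directory.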

2 answers:

Answer 0 (score: 2)

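Open the file with a with context manager. The file handle is closed when the with block exits, so you won't accidentally leave a large number of file handles open.

Also, file is a built-in class in Python 2, so it is best not to use it as a variable name.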

import os
PIs = []
for root, dirs, files in os.walk("/rootDir/"):
  for fname in files:
    if fname.endswith('sum.txt'):
      print(fname)  # the matching file name
      # os.walk yields bare file names, so join with root to get a path that can be opened
      with open(os.path.join(root, fname), 'r') as fh_in:
        for line in fh_in:
          print(line)  # now a whole line of the file, not a single character
          line = line.rstrip()
          line = line.split('\t')
          PIs.append(line[2])
print(PIs)  # third tab-separated field of every line that was read
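The same approach can be wrapped in a reusable function. The following is a minimal sketch for Python 3; the function name collect_column and its parameters are hypothetical, and it assumes the files are tab-separated as in the question. Unlike the loop above, it skips lines that have too few fields instead of raising an IndexError.

```python
import os


def collect_column(root_dir, suffix="sum.txt", column=2):
    """Walk root_dir and collect one tab-separated column from matching files."""
    values = []
    for root, dirs, files in os.walk(root_dir):
        for fname in files:
            if not fname.endswith(suffix):
                continue
            # os.walk yields bare file names, so join them with root
            path = os.path.join(root, fname)
            with open(path, "r") as fh_in:
                for line in fh_in:
                    fields = line.rstrip("\n").split("\t")
                    if len(fields) > column:  # skip blank or short lines
                        values.append(fields[column])
    return values
```

Calling collect_column("/rootDir/") would return the collected values as strings; convert them with int() or float() before doing arithmetic on them.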

Answer 1 (score: 0)

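Try not to use a built-in name such as file as a variable name. Use f, file_, etc.

file_ is a string that holds only the file name, so iterating over it yields its characters. Change the line

for line in file_:

to

for line in open(os.path.join(root, file_)).readlines():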
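The difference between iterating over a file name and iterating over an opened file can be seen in a small self-contained sketch. The temporary demo file and its contents are made up for illustration only.

```python
import os
import shutil
import tempfile

# Hypothetical demo file, created only so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sum.txt")
with open(path, "w") as fh:
    fh.write("a\tb\t1\nc\td\t2\n")

name = os.path.basename(path)

# Iterating over the name (a string) walks through its characters.
chars = [ch for ch in name]

# Iterating over the opened file walks through its lines.
with open(path) as fh_in:
    lines = [line.rstrip("\n") for line in fh_in]

print(chars[0])  # first character of the file name
print(lines[0])  # first line of the file

shutil.rmtree(tmp_dir)  # remove the demo directory again
```

This is why the loop in the question printed a single letter: it was stepping through the characters of the file name rather than the lines of the file.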