Question

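I need to os.walk from my parent path (tutu) through all of its subfolders. Each of the deepest subfolders holds the files I need to process with my code. For all the deepest folders that have files, the file 'layout' is the same: one file *.adf.txt, one file *.idf.txt, one file *.sdrf.txt and one or more files *.dat, as shown in the picture.

My problem is that I don't know how to use the os module to iterate in order from my parent folder through all the subfolders. I need a function that, for the current subfolder in os.walk, continues into the sub-subfolders inside it (if there are any) when that subfolder is empty. If files exist, it should verify that the file layout is present (that part is no problem...) and, if so, apply my code (also no problem). If not, and the folder has no more subfolders, it should return to the parent folder and os.walk into the next subfolder, and so on for every subfolder of my parent folder (tutu).

To sum up, I need something like the function below (written in a mix of Python and imaginary code):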

for all folders in tutu:
    if os.havefiles in os.walk(current_path):  # 'havefiles' doesn't exist, I think...
        for filename in os.walk(current_path):
            if 'adf' in filename:
                etc...
                #my code
    elif:
        while true:
            go deep
    else:
        os.chdir(parent_folder)

Do you think it would be best to put this in a function that I call from my code?

This is the code I tried to use, without success of course:

import csv
import os
import fnmatch

abs_path=os.path.abspath('.')
for dirname, subdirs, filenames in os.walk('.'):
    # print path to all subdirectories first.
    for subdirname in subdirs:
        print os.path.join(dirname, subdirname), 'os.path.join(dirname, subdirname)'
        current_path= os.path.join(dirname, subdirname)
        os.chdir(current_path)
        for filename in os.walk(current_path):
            print filename, 'f in os.walk'
            if os.path.isdir(filename)==True:
                break
            elif os.path.isfile(filename)==True:
                print filename, 'file'
        #code here

Thanks in advance...

Answer 1

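"I need a function that, for the current subfolder in os.walk, continues into the sub-subfolders inside it (if there are any) when that subfolder is empty."

That doesn't make any sense. If a folder is empty, it doesn't have any subfolders.

Maybe you mean that if it has no regular files, then recurse into its subfolders, but if it has any, don't recurse and check the layout instead?

To do that, all you need is something like this: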

import collections
import os

for dirname, subdirs, filenames in os.walk('.'):
    if filenames:
        # can't use os.path.splitext, because that will give us .txt instead of .adf.txt
        # partition('.') splits at the first dot, so 'x.adf.txt' yields 'adf.txt'
        extensions = collections.Counter(filename.partition('.')[-1]
                                         for filename in filenames)
        if (extensions['adf.txt'] == 1 and extensions['idf.txt'] == 1 and
            extensions['sdrf.txt'] == 1 and extensions['dat'] >= 1 and
            len(extensions) == 4):
            # got a match, do what you want
            pass

        # Whether this is a match or not, prune the walk.
        del subdirs[:]

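I'm assuming here that you only want to find directories that hold exactly the specified files and nothing else. To drop that last restriction, just remove the len(extensions) == 4 part.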
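
If it helps, the layout test can be packaged so that the exact-match restriction becomes a switch. This is only a sketch: the name has_layout and its exact parameter are made up for illustration, and it matches on str.endswith rather than partition, on the assumption that file stems may themselves contain dots.

```python
import collections

# Suffixes that must appear exactly once, plus the one allowed to repeat.
SINGLE_SUFFIXES = ('.adf.txt', '.idf.txt', '.sdrf.txt')
REPEATED_SUFFIX = '.dat'


def has_layout(filenames, exact=True):
    """Return True if filenames match the layout described in the question.

    With exact=True, any file outside the layout makes the check fail.
    """
    counts = collections.Counter()
    for name in filenames:
        for suffix in SINGLE_SUFFIXES + (REPEATED_SUFFIX,):
            if name.endswith(suffix):
                counts[suffix] += 1
                break
        else:
            counts['other'] += 1  # a file that is not part of the layout

    if any(counts[s] != 1 for s in SINGLE_SUFFIXES):
        return False
    if counts[REPEATED_SUFFIX] < 1:
        return False
    return not (exact and counts['other'])
```

Inside the walk you would call has_layout(filenames) in place of the long if condition, and pass exact=False to tolerate extra files.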

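There's no need to iterate over subdirs explicitly, or to call os.walk recursively from inside os.walk. The whole point of walk is that it already recursively visits every subdirectory it finds, unless you explicitly tell it not to (by pruning the list it gives you).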
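
To see the pruning on its own, here is a minimal sketch with the layout check left out. The generator name dirs_with_files is hypothetical; it yields the first directory on each branch that contains any files and stops os.walk from going deeper on that branch.

```python
import os


def dirs_with_files(root):
    """Yield (dirname, filenames) for the first level on each branch that has files."""
    for dirname, subdirs, filenames in os.walk(root):
        if filenames:
            yield dirname, filenames
            # Emptying the list in place tells os.walk not to descend further.
            del subdirs[:]
```

Looping over dirs_with_files('.') then gives one entry per folder that has files, with no nested os.walk calls and no os.chdir.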

Answer 2

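os.walk will automatically "dig down" recursively, so you don't need to recurse through the tree yourself.

I think this should be the basic form of your code: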

import csv
import os
import fnmatch

directoriesToMatch = [list here...]
filenamesToMatch = [list here...]

abs_path=os.path.abspath('.')
for dirname, subdirs, filenames in os.walk('.'):
    if len(set(directoriesToMatch).difference(subdirs))==0:     # all dirs are there
        if len(set(filenamesToMatch).difference(filenames))==0: # all files are there
            if <any other filename/directory checking code>:
                # processing code here ...

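According to the Python documentation, if you don't want to keep recursing for any reason, just delete entries from subdirs in place: http://docs.python.org/2/library/os.html

If you want to check that there are no subdirectories left in which to look for files to process, you can also change the dirs check to: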
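
As a sketch of what deleting entries can look like, the function below skips some directories by name. The SKIP set and the name walk_skipping are made up for illustration. The assignment goes through subdirs[:] so that the list object os.walk holds is changed in place; rebinding the name subdirs would have no effect on the walk.

```python
import os

SKIP = {'.git', '__pycache__'}  # hypothetical names to leave unvisited


def walk_skipping(root, skip=SKIP):
    """Return the directories os.walk visits when names in skip are pruned."""
    visited = []
    for dirname, subdirs, filenames in os.walk(root):
        visited.append(dirname)
        # Slice assignment mutates the list that os.walk itself iterates over.
        subdirs[:] = [d for d in subdirs if d not in skip]
    return visited
```

This relies on the default top-down order of os.walk; with topdown=False the subdirectories have already been visited by the time the list is edited.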

    if len(subdirs)==0: # check that this directory has no subdirectories

I'm not sure I fully understand the question, so I hope this helps!

EDIT:

OK, so if you need to check that there are no files, use:

    if len(filenames)==0:

But as I said above, it's probably better to just look FOR the specific files rather than checking for empty directories.
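
Since the question already imports fnmatch, looking for the specific files could go roughly like this. It is a sketch, not tested against your data: the PATTERNS tuple is taken from the layout in the question, and both function names are hypothetical.

```python
import fnmatch
import os

PATTERNS = ('*.adf.txt', '*.idf.txt', '*.sdrf.txt', '*.dat')


def matching_files(filenames, patterns=PATTERNS):
    """Map each pattern to the filenames that match it."""
    return {p: fnmatch.filter(filenames, p) for p in patterns}


def dirs_with_all_patterns(root, patterns=PATTERNS):
    """Yield directories under root that hold at least one file per pattern."""
    for dirname, subdirs, filenames in os.walk(root):
        found = matching_files(filenames, patterns)
        # An empty list is falsy, so all() fails if any pattern has no match.
        if all(found.values()):
            yield dirname, found
```

Directories with no files simply fail the all() test and the walk moves on, so there is no separate empty-directory check.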

Need an 'if os.havefiles' function to search subfolders in Python
