Question

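I'm a Python beginner. I'm trying to read all the files in a directory and its subdirectories via a path and merge the relevant ones into a single file. I also want to exclude some specific subdirectories, and that is the step where I'm stuck. Any help from the experts is appreciated!

Here are some details.

Main path: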

/usr/home/micro/**/*.txt

Subdirectory to skip:

 /usr/home/micro/frame/test

My Python code so far:

#!/usr/bin/env python3

import os
import sys
import glob

complete = glob.glob('/usr/home/micro/**/*.txt', recursive=True) 

def test():
    with open("results.txt", "w") as f:
        for name in complete:
            for root, dirs, files in os.walk("/usr/home/micro/frame/"):
                for skipped in ("/usr/home/micro/frame/test"):
                    if skipped in dirs:
                        dirs.remove(skipped)
            with open(name) as currentfile:
                current = currentfile.read()
            f.write(current)

def main ():
    test()

main()

Answer 1

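The line for skipped in ("/usr/home/micro/frame/test"): doesn't do what you think it does. It isn't iterating over a tuple; it is iterating over the characters of the single path string.

You need a comma before the closing parenthesis to make it a tuple: ("/usr/home/micro/frame/test",). Without the comma, the parentheses are just an unnecessary order-of-operations hint (the same way (2*2)+1 is the same as 2*2+1). Alternatively, if you only want to exclude one path, you can get rid of the loop entirely.

That won't fix the code by itself, because apart from trying to exclude the unwanted folder you aren't actually doing anything useful inside the os.walk loop. But if you get rid of the loop over complete and iterate over the files that os.walk gives you, you can do what you want.

Try something like this: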
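
The difference is easy to check in an interpreter. Here is a minimal sketch; the variable names are made up for illustration and are not part of the question's code:

```python
# Parentheses alone do not create a tuple; the trailing comma does.
not_a_tuple = ("/usr/home/micro/frame/test")
one_tuple = ("/usr/home/micro/frame/test",)

print(type(not_a_tuple).__name__)   # str
print(type(one_tuple).__name__)     # tuple

# Iterating over the string yields single characters ...
first_items = [item for item in not_a_tuple][:4]
# ... while iterating over the tuple yields the whole path once.
tuple_items = [item for item in one_tuple]

print(first_items)
print(tuple_items)
```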

def test():
    with open("results.txt", "w") as f:
        for root, dirs, files in os.walk("/usr/home/micro/frame/"):   # get rid of first loop
            for skipped in ("/usr/home/micro/frame/test",):       # add comma to make a tuple
                # dirs holds bare names like "test", so split the path before comparing
                parent, dirname = os.path.split(skipped)
                if os.path.normpath(root) == parent and dirname in dirs:
                    dirs.remove(dirname)           # in-place removal stops os.walk descending
            for name in files:           # move the rest of the logic inside the os.walk loop
                fullname = os.path.join(root, name)
                with open(fullname) as currentfile:
                    current = currentfile.read()
                f.write(current)
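
One thing to watch for with os.walk: dirs contains bare directory names such as "test", not full paths, and os.walk only skips a directory if it is removed from dirs in place. The sketch below isolates that pruning step in a reusable function. The names merge_txt, top, skip_dirs and out_path are hypothetical, and filtering on the .txt extension is an assumption taken from the glob pattern in the question. It also assumes the entries in skip_dirs are written the same way as top (both absolute or both relative).

```python
import os

def merge_txt(top, skip_dirs, out_path):
    """Concatenate every .txt file under top into out_path, skipping skip_dirs."""
    skip = {os.path.normpath(d) for d in skip_dirs}
    out_abs = os.path.abspath(out_path)
    with open(out_path, "w") as out:
        for root, dirs, files in os.walk(top):
            # Slice assignment edits the very list os.walk uses, so the
            # pruned directories are never descended into.
            dirs[:] = sorted(
                d for d in dirs
                if os.path.normpath(os.path.join(root, d)) not in skip
            )
            for name in sorted(files):
                fullname = os.path.join(root, name)
                # Ignore non-.txt files and the output file itself.
                if not name.endswith(".txt") or os.path.abspath(fullname) == out_abs:
                    continue
                with open(fullname) as src:
                    out.write(src.read())
```

With the paths from the question, the call might look like merge_txt("/usr/home/micro", ["/usr/home/micro/frame/test"], "results.txt").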

Answer 2

Just skip the names that start with your path:

def test():
    with open("results.txt", "w") as f:
        for name in complete:
            # trailing slash so that siblings such as .../frame/test2 are not skipped too
            if name.startswith("/usr/home/micro/frame/test/"):
                continue
            with open(name) as currentfile:
                current = currentfile.read()
            f.write(current)

There is no need to use both os.walk and glob.glob; one of them is enough.
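
A caveat with a plain startswith test is that it compares characters, not path components, so a prefix without a trailing separator also matches a sibling directory whose name merely begins the same way. The sketch below shows a component-aware check using pathlib; is_under is a hypothetical helper name, and the example file names are made up.

```python
from pathlib import PurePosixPath

def is_under(path, directory):
    """Return True if path is directory itself or lies anywhere below it."""
    p, d = PurePosixPath(path), PurePosixPath(directory)
    return p == d or d in p.parents

skip = "/usr/home/micro/frame/test"
names = [
    "/usr/home/micro/a.txt",
    "/usr/home/micro/frame/test/b.txt",    # inside the skipped directory
    "/usr/home/micro/frame/test2/c.txt",   # sibling that only shares a prefix
]
kept = [n for n in names if not is_under(n, skip)]
print(kept)
```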

Answer 3

It seems the names of the directories to skip are stored in the file "/usr/home/micro/frame/test". In that case you need to read this file into a list and use that list:

with open("skipfilenames.txt", "r") as f:
    skiplist = f.read().splitlines()

Then you can use:

for skipped in skiplist:
...
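
What goes inside that loop depends on the rest of the program. As one possibility, and not necessarily what the answerer had in mind, the skip list could be used to filter a list of file names such as the glob results from the question. In this sketch load_skiplist and filter_names are hypothetical names, the skip file is assumed to hold one directory per line, and paths are assumed to be POSIX-style as in the question.

```python
def load_skiplist(path):
    """Read one directory per line, ignoring blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def filter_names(names, skiplist):
    """Drop every name that lies inside one of the skipped directories."""
    # A trailing "/" keeps siblings with a shared prefix from being dropped.
    prefixes = tuple(d.rstrip("/") + "/" for d in skiplist)
    # str.startswith accepts a tuple and tests each prefix in turn.
    return [n for n in names if not n.startswith(prefixes)]
```

For example, filter_names(complete, load_skiplist("skipfilenames.txt")) would return the names that are safe to merge.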

How do I exclude a directory when reading multiple files?

3 answers: