Question

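I am trying to walk a directory and read the files in each subdirectory. However, I need to keep track of each subdirectory's name, because it is a time value. I managed to build a dictionary like this: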

dict = {'time1/dir1': ['file1.ext', 'file2.ext'], 'time2/dir2': ['name1.ext', 'name2.ext']}

But I can't find the right way to pass each file's full name to a function.

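When I try to use np.fromfile(), I need to join the directory/time name with every file in its list, recursively, and store the results in the same layout I already have: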

dict2 = {'time1/dir1':[value1, value2], 'time2/dir2':[value1, value2], }

I have also read the directories into a pandas DataFrame, but I still need to read the files in a time-consistent way.

I have tried using and combining os.walk(), os.path.join(), os.listdir(), glob.glob() and others, but my logic in using these functions is probably wrong.

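I know there is probably a simpler, more robust way to loop directly while keeping the timestamp/directory name, instead of creating lots of dictionaries and lists.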

Answer 1

Is this what you are looking for?

import os

base_path = "my/base/path"
directory_generator = os.walk(base_path)
next(directory_generator)  # skip base_path itself; only its subdirectories are wanted
path_tree = {}
for root_path, directories, files in directory_generator:
    # key: subdirectory name; value: full paths of the files inside it
    path_tree[os.path.basename(root_path)] = [
        os.path.join(root_path, file_path) for file_path in files]

The result looks like this:

{
    "dir1": [
        "my/base/path/dir1/file1.ext",
        "my/base/path/dir1/file2.ext"
    ],
    "dir2": [
        "my/base/path/dir2/anotherfile1.ext",
        "my/base/path/dir2/anotherfile2.ext"
    ],
}
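
If the goal is the dict2 layout from the question (the time/directory key mapped to the values read from each file), the same walk can call a reader on every full path instead of storing the path. The following is only a rough sketch: collect_by_subdir and its reader parameter are names made up for illustration, and it assumes the key should be the subdirectory path relative to the base path (for example "time1/dir1") rather than just the last directory name.

```python
import os


def collect_by_subdir(base_path, reader):
    """Map each subdirectory (relative to base_path) to [reader(full_path), ...].

    `reader` is any callable that takes a file path, so the loader is an
    assumption supplied by the caller rather than fixed here.
    """
    results = {}
    for root_path, directories, files in os.walk(base_path):
        directories.sort()  # visit subdirectories in a deterministic order
        key = os.path.relpath(root_path, base_path).replace(os.sep, "/")
        # Skip the base directory itself and directories that hold no files.
        if key == "." or not files:
            continue
        results[key] = [
            reader(os.path.join(root_path, name)) for name in sorted(files)]
    return results
```

With NumPy installed, something like collect_by_subdir("my/base/path", np.fromfile) should give one array per file under each key. Note that np.fromfile reads raw binary as float64 unless told otherwise, so you may need to wrap it (for instance with functools.partial and a dtype argument) to match how your files were written.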

How to join subdirectory and file names for processing - python
