Question

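I would be grateful if someone could help me. I know almost nothing about Python, so please excuse my naivety. I have spent two days reading this site trying to get past where I am stuck.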

I wrote this code (most of it picked up from this site):

    import os
    path = '/the/path/to/the/I want/to/count'
    file_count = sum((len(f) for _, _,f in os.walk(path)))
    print("Number of files: {}".format(file_count))

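I get my file count, but it takes a while. Is there faster code? It descends into subdirectories, and I think the file count is higher than I expected.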

My ultimate goal is to count the files by the first two letters of each file name, e.g. AL, AR, AZ. Could I get an example of what I would have to add?

Answer 1

Yes, os.walk() traverses subdirectories.

If you need counts grouped by the first two letters, I would use the collections.Counter() class:

    import os
    from collections import Counter

    path = '/the/path/to/the/I want/to/count'
    # Tally the first two characters of every file name found under path.
    counts = Counter(fname[:2] for _, _, files in os.walk(path) for fname in files)
    for initials, count in counts.most_common():
        print('{}: {:>20}'.format(initials, count))

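This traverses subdirectories, collects counts grouped by the first two characters of every file name encountered, and then prints the counts sorted from most common to least common.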
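To make the grouping easier to try out, here is a minimal sketch that wraps the same os.walk() plus Counter idea in a function and runs it on a throwaway directory tree. The function name count_by_prefix, its length parameter, and the sample file names are made up for illustration; they are not part of the answer above.

```python
import os
import tempfile
from collections import Counter


def count_by_prefix(root, length=2):
    """Count files under root (recursively), keyed by the first
    `length` characters of each file name. Hypothetical helper."""
    return Counter(
        fname[:length]
        for _, _, files in os.walk(root)
        for fname in files
    )


if __name__ == "__main__":
    # Build a small temporary tree so the example needs no real data.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        samples = ("AL001.txt", "AL002.txt", "AR001.txt",
                   os.path.join("sub", "AL003.txt"))
        for name in samples:
            open(os.path.join(root, name), "w").close()

        for prefix, count in count_by_prefix(root).most_common():
            print("{}: {}".format(prefix, count))
```

With these sample files the AL group has three entries, because the file inside sub is counted too, and the AR group has one.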

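If you do not want to traverse subdirectories, use os.listdir() instead; it returns only the names in the given directory (both file names and directory names). You can then use os.path.isfile() to keep only the names that are files: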

    counts = Counter(fname[:2] for fname in os.listdir(path)
                     if os.path.isfile(os.path.join(path, fname)))

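If you are looking for files with a specific extension, test for that extension instead of calling isfile(); presumably no subdirectory will use the same extension: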

    counts = Counter(fname[:2] for fname in os.listdir(path)
                     if fname.endswith('.pdf'))
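As a variation on the non-recursive count, the sketch below uses os.scandir() rather than os.listdir(). This is a swapped-in technique, not what the answer uses, and it assumes Python 3.6 or later. Each entry it yields can usually answer is_file() without an extra call per name, which may help with the speed concern in the question, though I have not measured it. The function name count_top_level and its length and suffix parameters are hypothetical.

```python
import os
import tempfile
from collections import Counter


def count_top_level(path, length=2, suffix=None):
    """Count regular files directly inside path (no recursion), keyed by
    the first `length` characters of the name. If suffix is given, only
    names ending with it are counted. Hypothetical helper."""
    counts = Counter()
    with os.scandir(path) as entries:
        for entry in entries:
            if not entry.is_file():
                continue  # skip subdirectories and other non-files
            if suffix is not None and not entry.name.endswith(suffix):
                continue
            counts[entry.name[:length]] += 1
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "ALdir"))  # a directory, not counted
        for name in ("AL1.pdf", "AL2.txt", "AR1.pdf"):
            open(os.path.join(root, name), "w").close()
        print(count_top_level(root))
        print(count_top_level(root, suffix=".pdf"))
```

Keeping the is_file() check even when filtering by suffix avoids relying on the assumption that no subdirectory shares the extension.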

Answer 2

You could try

    import glob

    len(glob.glob('/the/path/to/the/I want/to/count/AL*'))
    len(glob.glob('/the/path/to/the/I want/to/count/AR*'))

and so on.
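The repeated glob calls can be folded into a loop over the prefixes of interest. The following is only a sketch: the function name count_matching and the prefix list are invented for illustration, and glob.escape() (available since Python 3.4) is used so that characters such as [ in the directory name are not read as wildcards.

```python
import glob
import os
import tempfile


def count_matching(directory, prefixes):
    """Return a dict mapping each prefix to the number of names in
    directory that start with it. Hypothetical helper."""
    counts = {}
    for prefix in prefixes:
        # Escape the literal parts; only the trailing * is a wildcard.
        pattern = os.path.join(glob.escape(directory),
                               glob.escape(prefix) + "*")
        counts[prefix] = len(glob.glob(pattern))
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("AL1.txt", "AL2.txt", "AR1.txt"):
            open(os.path.join(root, name), "w").close()
        print(count_matching(root, ["AL", "AR", "AZ"]))
```

Two caveats: a pattern like AL* also matches directory names, not just files, and each prefix triggers its own scan of the directory, so for many prefixes a single pass with a counter is likely cheaper.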

Count files by their first two letters
