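I have a folder of mixed files and only want to count the image files. The code below returns the total number of files in the directory, not just the images. What am I doing wrong?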
extensions = ['.jpg','.png','.gif']
DL_path = os.getcwd()
for dirpath, dirnames, files in os.walk(DL_path):
    for original_file in files:
        todays_files = sum(1 for x in files if any(needle in original_file for needle in extensions))
print(todays_files)
If I have one jpg, one png and two txt files, todays_files should be 2, but it comes out as 4.
Answer 0 (score: 3)
You can use a set to avoid duplicate items:
>>> found_extensions = set()
>>> found_extensions.add('.png')
>>> found_extensions.add('.png') # try to add .png again
>>> found_extensions
{'.png'}  # <-- appears only once
import os
extensions = {'.jpg', '.png', '.gif'}  # set literal
found_extensions = set()
for dirpath, dirnames, files in os.walk(os.getcwd()):
    for f in files:
        # adding an extension that is already present has no effect
        found_extensions.add(os.path.splitext(f)[-1])
print(extensions & found_extensions)       # intersection (&) keeps only the image extensions found
print(len(extensions & found_extensions))  # number of distinct image extensions found
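Note that the intersection reports which image extensions occur, not how many files carry them: two .jpg files still contribute a single '.jpg' entry. A minimal sketch of a per-extension tally, assuming an in-memory list of names in place of the os.walk output (sample_files, all_counts and per_extension are illustrative names, not part of the answer):

```python
import os
from collections import Counter

extensions = {'.jpg', '.png', '.gif'}
# Hypothetical file names standing in for what os.walk would yield.
sample_files = ['a.jpg', 'b.jpg', 'c.png', 'notes.txt']

# Tally every extension, then keep only the image ones.
all_counts = Counter(os.path.splitext(name)[-1] for name in sample_files)
per_extension = {ext: n for ext, n in all_counts.items() if ext in extensions}

distinct_found = extensions & set(all_counts)  # which image extensions occur
total_images = sum(per_extension.values())     # how many image files there are

print(sorted(distinct_found), total_images)
```

Here the set intersection has two members while the file total is three, which is why the set alone is not enough to count files.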
Update: to get the number of matching files in each directory:
import os
extensions = {'.jpg', '.png', '.gif'}  # set literal
for dirpath, dirnames, files in os.walk(os.getcwd()):
    count = sum(os.path.splitext(f)[-1] in extensions for f in files)
    print(dirpath, count)
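The expression os.path.splitext(f)[-1] in extensions checks whether the file has one of the wanted extensions and evaluates to True (= 1) or False (= 0). Summing those values gives the count you want.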
>>> True == 1
True
>>> False == 0
True
>>> sum([True, False, False, True, False])
2
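Applied to the example from the question, the same idea can be checked without touching the file system. A minimal sketch, assuming a hard-coded list of names (example_files and flags are illustrative, not part of the answer):

```python
import os

extensions = {'.jpg', '.png', '.gif'}
# The four files described in the question (the names are made up).
example_files = ['photo.jpg', 'logo.png', 'notes.txt', 'todo.txt']

# Each membership test yields True or False; sum() adds them as 1 and 0.
flags = [os.path.splitext(name)[-1] in extensions for name in example_files]
count = sum(flags)

print(flags)
print(count)
```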
Answer 1 (score: 0)
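The condition inside your inner loop tests the wrong variable. The generator expression iterates over files as x, but any(...) checks original_file, the outer loop variable, so the result is the same for every x. Whenever original_file contains one of the extensions, any returns True for all of them and every file in the directory is counted; otherwise none is. That is why you get 4 instead of 2.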
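The effect can be reproduced on a plain list. A sketch, assuming the four example files from the question (buggy_count and fixed_count are illustrative helpers, not code from the thread): because the original condition looks at original_file rather than x, the result is either all files or none, depending on which file the outer loop is currently on.

```python
extensions = ['.jpg', '.png', '.gif']
files = ['photo.jpg', 'logo.png', 'notes.txt', 'todo.txt']  # made-up names

def buggy_count(original_file):
    # Condition from the question: it tests original_file, never x.
    return sum(1 for x in files
               if any(needle in original_file for needle in extensions))

def fixed_count():
    # Test the file being counted (x) instead.
    return sum(1 for x in files
               if any(x.endswith(needle) for needle in extensions))

print([buggy_count(name) for name in files])
print(fixed_count())
```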
Instead, you can get each file's extension and check whether it is in the list of file types you care about.
import os
extensions = ['.jpg', '.png', '.gif']
DL_path = os.getcwd()
todays_files = []
for dirpath, dirnames, files in os.walk(DL_path):
    for original_file in files:
        filename, file_extension = os.path.splitext(original_file)
        if file_extension in extensions:
            todays_files.append(original_file)
    # todays_files is never reset, so this is a running total over the directories visited so far
    print(dirpath, len(todays_files))
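If a single total for the whole tree is wanted, the check can be wrapped in a function. A sketch under assumptions: count_images and IMAGE_EXTENSIONS are hypothetical names, and lower-casing the extension (so that .JPG also matches) is an addition that is not present in the answers above.

```python
import os

IMAGE_EXTENSIONS = {'.jpg', '.png', '.gif'}

def count_images(root, extensions=IMAGE_EXTENSIONS):
    """Return the number of files under root whose extension is in extensions."""
    total = 0
    for dirpath, dirnames, files in os.walk(root):
        # Lower-case the extension so 'PHOTO.JPG' is counted too (an assumption).
        total += sum(os.path.splitext(f)[-1].lower() in extensions for f in files)
    return total

if __name__ == '__main__':
    print(count_images(os.getcwd()))
```

Using a set for the extensions keeps each membership test cheap, and returning the number instead of printing it makes the function easy to reuse.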