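I am using collections.Counter to turn a dictionary that maps each path to a list of its MIME types into per-path counts. It's a great little module, but the Counter values do not change from path to path.
I have a dictionary called package_mime_types, where every entry looks like this: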
package_mime_types['/a/path/to/somewhere'] = [('text/plain'),('text/plain'),('application/msword')]...
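As you can imagine, the values in that dictionary are very long. I am trying to convert each one into a mapping of counts like this:
package_mime_types['/a/path/to/somewhere'] = {'text/plain': 780, 'application/msword': 400, 'audio/mp3': 30}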
Here is my little loop that is supposed to do this:
for package_path, mime_types_list in package_mime_types.items():
    c = collections.Counter(mime_types_list)
    package_mime_types[package_path] = c
return package_mime_types
The end result works, but the Counter for every path is exactly the same.
/path1/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path2/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path3/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path4/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path5/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
Am I missing something about how to use Counter?
Answer 0 (score: 0)
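I'm facepalming right now. This was not a Counter problem at all, but a problem with the loop I used to build the lists of file types. I was not creating a new list each time an iteration populated my dictionary, so all of the files ended up associated with every path.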
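The effect can be reproduced without touching the filesystem. The following is only a minimal sketch: the paths and MIME types are made up, and the names shared, result and counts are hypothetical. It shows that assigning the same list to several keys stores a reference rather than a copy, so every Counter built from those values comes out identical.

```python
import collections

# Hypothetical paths and MIME types, used only for illustration.
shared = []  # created once, outside the loop
result = {}
for path, types in [("/path1/", ["text/plain"]), ("/path2/", ["audio/x-wav"])]:
    shared.extend(types)
    result[path] = shared  # stores a reference to the same list, not a copy

# Both keys point at the same list object, so their counts are identical.
same_object = result["/path1/"] is result["/path2/"]
counts = {path: collections.Counter(types) for path, types in result.items()}
print(same_object)
print(counts["/path1/"] == counts["/path2/"])
```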
import mimetypes
import os

def find_mimes(package_paths):
    package_mime_types = {}
    # Bug: this single list is created once and shared by every path
    mime_types_list = []
    # Walk through the directories looking for MIME types
    for package_path in package_paths:
        print(package_path, "is being walked through")
        for root, dirs, files in os.walk(package_path, followlinks=True):
            for file in files:
                if mimetypes.guess_type(os.path.join(root, file)) != (None, None):
                    mime_types_list.append(mimetypes.guess_type(os.path.join(root, file))[0])
        package_mime_types[package_path] = mime_types_list
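See how mime_types_list is defined above the loop? It is created once, so it behaves like a static variable: every dictionary value refers to the same list, which keeps growing as each path is walked. Moving it inside the package_path loop fixes it.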
def find_mimes(package_paths):
    package_mime_types = {}
    # Walk through the directories looking for MIME types
    for package_path in package_paths:
        # Reset mime_types_list to a new empty list for every path. (Duh)
        # Now mime_types_list is empty before the walk starts
        mime_types_list = []
        print(package_path, "is being walked through")
        for root, dirs, files in os.walk(package_path, followlinks=True):
            for file in files:
                if mimetypes.guess_type(os.path.join(root, file)) != (None, None):
                    mime_types_list.append(mimetypes.guess_type(os.path.join(root, file))[0])
        package_mime_types[package_path] = mime_types_list
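As a possible variation, the intermediate lists can be skipped by counting straight into a fresh Counter for each path. This is only a sketch under a few assumptions: count_mimes is a hypothetical name, not part of the original code, and it tests the guessed type itself against None instead of comparing the whole tuple with (None, None), so files where only an encoding is guessed are skipped.

```python
import collections
import mimetypes
import os


def count_mimes(package_paths):
    """Map each path to a Counter of the MIME types found beneath it."""
    package_mime_counts = {}
    for package_path in package_paths:
        counts = collections.Counter()  # fresh Counter for every path
        for root, dirs, files in os.walk(package_path, followlinks=True):
            for name in files:
                mime_type, _encoding = mimetypes.guess_type(os.path.join(root, name))
                if mime_type is not None:  # skip files whose type cannot be guessed
                    counts[mime_type] += 1
        package_mime_counts[package_path] = counts
    return package_mime_counts
```

Because the Counter is created inside the package_path loop, each path gets its own object, which is the same fix as above applied one step earlier.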