Iterating over different directories, but the collections.Counter values are the same

Time: 2013-12-12 18:10:57

Tags: python python-3.x dictionary

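I'm using collections.Counter to turn a dictionary that maps paths to lists of MIME types into per-path counts. It's a great little module, but the Counter values don't change from path to path.

I have a dictionary named package_mime_types, and each entry looks like this: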

    package_mime_types['/a/path/to/somewhere'] = [('text/plain'),('text/plain'),('application/msword')]...

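As you can imagine, the values in that dictionary are very long. I'm trying to convert each one into a mapping of counts like this:

    package_mime_types['/a/path/to/somewhere'] = {'text/plain': 780, 'application/msword': 400, 'audio/mp3': 30}

Here is my little loop that is supposed to do that: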

    for package_path, mime_types_list in package_mime_types.items():
        c = collections.Counter(mime_types_list)

        package_mime_types[package_path] = c
    return package_mime_types

The code runs, but the Counter for every path comes out exactly the same.

/path1/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path2/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path3/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path4/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})
/path5/ relates to Counter({'text/plain': 2303, 'audio/x-wav': 90, 'text/html': 17, 'application/msword': 17, 'application/x-trash': 6, 'application/x-tar': 4, 'application/xml': 1, 'text/x-sh': 1})

Am I missing something about how to use Counter?

1 Answer:

Answer 0 (score: 0)

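I'm facepalming at myself right now. This isn't a Counter problem at all. The problem is in the loop that builds the list of file types: I wasn't creating a new list each time I filled in an entry of my dictionary, so every file ended up associated with every path.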
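The effect can be reproduced in a few lines. This is only a minimal sketch with made-up paths and MIME types (not the real data): one list is shared across iterations, so every dictionary value is the same object and every Counter built from it comes out identical.

```python
import collections

# Hypothetical paths and MIME types, made up for this illustration.
samples = [
    ("/path1/", ["text/plain"]),
    ("/path2/", ["audio/x-wav", "text/plain"]),
]

shared = []      # created once, outside the loop
by_path = {}
for path, mimes in samples:
    shared.extend(mimes)      # keeps growing across iterations
    by_path[path] = shared    # stores a reference to the list, not a copy

# Both keys point at the very same list object.
same_object = by_path["/path1/"] is by_path["/path2/"]

# Counting each value therefore gives identical Counters.
counts = {path: collections.Counter(mimes) for path, mimes in by_path.items()}
print(same_object)
print(counts["/path1/"] == counts["/path2/"])
```

Assigning a list to a dictionary key never copies it, which is why the counts for the first path also include the files found under the second.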

    import mimetypes
    import os

    def find_mimes(package_paths):
        package_mime_types = {}
        # Bug: this single list is created once and shared by every path
        mime_types_list = []
        ## Walking through directories looking for the mime types
        for package_path in package_paths:
            print(package_path, "is being walked through")
            for root, dirs, files in os.walk(package_path, followlinks=True):
                for file in files:
                    if mimetypes.guess_type(os.path.join(root, file)) != (None, None):
                        mime_types_list.append(mimetypes.guess_type(os.path.join(root, file))[0])

            package_mime_types[package_path] = mime_types_list

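See how mime_types_list is created above the loop? It is never reset, so every path ends up pointing at the same ever-growing list. Moving it inside the package_path loop fixes it.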

    import mimetypes
    import os

    def find_mimes(package_paths):
        package_mime_types = {}

        ## Walking through directories looking for the mime types
        for package_path in package_paths:
            ## Setting mime_types_list back to empty for every path. (Duh)
            ## Now mime_types_list will be empty before the walking starts
            mime_types_list = []

            print(package_path, "is being walked through")
            for root, dirs, files in os.walk(package_path, followlinks=True):
                for file in files:
                    if mimetypes.guess_type(os.path.join(root, file)) != (None, None):
                        mime_types_list.append(mimetypes.guess_type(os.path.join(root, file))[0])

            package_mime_types[package_path] = mime_types_list
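
One possible variation, sketched here rather than taken from the code above, is to skip the intermediate list and count straight into a fresh Counter for each path. The name count_mimes is hypothetical. Note one deliberate difference: this sketch skips any file whose guessed type is None, even when an encoding was guessed, whereas the condition above only skips files where both are None.

```python
import collections
import mimetypes
import os


def count_mimes(package_paths):
    """Return {package_path: Counter of guessed MIME types} (hypothetical helper)."""
    package_mime_counts = {}
    for package_path in package_paths:
        # A new Counter per path, so nothing is shared between paths.
        counts = collections.Counter()
        for root, dirs, files in os.walk(package_path, followlinks=True):
            for name in files:
                mime_type, _encoding = mimetypes.guess_type(os.path.join(root, name))
                if mime_type is not None:
                    counts[mime_type] += 1
        package_mime_counts[package_path] = counts
    return package_mime_counts
```

Because the Counter is created inside the package_path loop, the shared-list mistake cannot happen, and the separate conversion pass over the dictionary is no longer needed.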