Python nested defaultdict with mixed data types

Time: 2015-08-05 16:02:40

Tags: python dictionary defaultdict

So, how do I create a defaultdict for this:

{
    'branch': {
        'count': 23,
        'leaf': {
            'tag1': 30,
            'tag2': 10
        }
    },
}

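so that count, tag1 and tag2 default to zero? I want to populate the dict dynamically as I read the input. When I see a new branch, I want to create a dict with a count of zero and an empty dict as leaf. When I get a leaf, I want to create a key with its name and set the value to zero.

Update: I accepted Martijn's answer because it has more upvotes, but the other answers are equally good.

3 answers: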

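Answer 0: (score: 3)

You can't do this with defaultdict, because the factory doesn't have access to the key.

However, you can simply subclass dict to create your own "smart" defaultdict-like class. Provide your own __missing__ method that adds a value based on the key: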
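
To illustrate the limitation, here is a minimal sketch (the name factory and the key strings are made up for this example). defaultdict calls its default_factory with no arguments, so the same factory runs no matter which key was missing:

```python
from collections import defaultdict

calls = []

def factory(*args):
    # Record whatever arguments defaultdict passes in (it passes none).
    calls.append(args)
    return 0

d = defaultdict(factory)
d['count']   # missing key, so factory() is called
d['leaf']    # same factory, same result, even though we'd want a dict here

print(calls)  # [(), ()]
```

Both keys end up with the value 0, because the factory has no way to tell 'count' apart from 'leaf'.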

class KeyBasedDefaultDict(dict):
    def __init__(self, default_factories, *args, **kw):
        self._default_factories = default_factories
        super(KeyBasedDefaultDict, self).__init__(*args, **kw)

    def __missing__(self, key):
        factory = self._default_factories.get(key)
        if factory is None:
            raise KeyError(key)
        new_value = factory()
        self[key] = new_value
        return new_value

Now you can provide your own mapping:

mapping = {'count': int, 'leaf': dict}
mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)

tree = KeyBasedDefaultDict(mapping)

Demo:

>>> mapping = {'count': int, 'leaf': dict}
>>> mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)
>>> tree = KeyBasedDefaultDict(mapping)
>>> tree['branch']['count'] += 23
>>> tree['branch']['leaf']['tag1'] = 30
>>> tree['branch']['leaf']['tag2'] = 10
>>> tree
{'branch': {'count': 23, 'leaf': {'tag1': 30, 'tag2': 10}}}
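
The demo assigns the leaf values directly. If leaf tags should also default to zero, as the question asks, one possible variation (an assumption, not part of the original answer) is to register a factory for 'leaf' that returns a defaultdict(int). The class is repeated here so the sketch runs on its own:

```python
from collections import defaultdict

class KeyBasedDefaultDict(dict):
    def __init__(self, default_factories, *args, **kw):
        self._default_factories = default_factories
        super(KeyBasedDefaultDict, self).__init__(*args, **kw)

    def __missing__(self, key):
        factory = self._default_factories.get(key)
        if factory is None:
            raise KeyError(key)
        new_value = factory()
        self[key] = new_value
        return new_value

# Assumed variation: 'leaf' yields a defaultdict(int), so tags start at zero.
mapping = {'count': int, 'leaf': lambda: defaultdict(int)}
mapping['branch'] = lambda: KeyBasedDefaultDict(mapping)

tree = KeyBasedDefaultDict(mapping)
tree['branch']['count'] += 1
tree['branch']['leaf']['tag1'] += 30   # no prior assignment needed
tree['branch']['leaf']['tag2'] += 10

# Keys without a registered factory still raise KeyError.
try:
    tree['unknown']
    unknown_raised = False
except KeyError:
    unknown_raised = True
```

Note that with this mapping only the literal key 'branch' creates a nested branch; any other top-level key raises KeyError.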

Answer 1: (score: 3)

Answering my own question, but I think this works too:

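import json
from collections import defaultdict

def branch():
    # Each new branch starts with a zero count and a leaf dict that defaults to 0.
    return {
        'count': 0,
        'leaf': defaultdict(int)
    }

tree = defaultdict(branch)
tree['first_branch']['leaf']['cat2'] = 2
print(json.dumps(tree, indent=2))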

# {
#   "first_branch": {
#     "count": 0, 
#     "leaf": {
#       "cat2": 2
#     }
#   }
# }
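
As a sketch of the "fill it in while reading input" use case from the question, the loop below feeds a few made-up (branch, leaf) records into the same structure. The record format and all names in records are assumptions for illustration only:

```python
import json
from collections import defaultdict

def branch():
    return {'count': 0, 'leaf': defaultdict(int)}

# Hypothetical input: one (branch name, leaf tag) pair per record.
records = [
    ('first_branch', 'tag1'),
    ('first_branch', 'tag2'),
    ('first_branch', 'tag1'),
    ('second_branch', 'tag3'),
]

tree = defaultdict(branch)
for branch_name, tag in records:
    node = tree[branch_name]   # created by branch() on first access
    node['count'] += 1
    node['leaf'][tag] += 1     # defaultdict(int) starts each tag at 0

# Round-trip through JSON to get plain dicts for comparison or storage.
result = json.loads(json.dumps(tree))
print(json.dumps(tree, indent=2, sort_keys=True))
```

Because each branch is a plain dict, only 'count' and 'leaf' exist on it; looking up any other key on a branch raises KeyError.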

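Answer 2: (score: 2)

Objects have a __dict__ that stores their data, and they let you set defaults programmatically. There is also an object called Counter, which I think you should delegate your leaf counts to.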

So I suggest you use an object that holds a collections.Counter:

import collections

class Branch(object):
    def __init__(self, leafs=(), count=0):
        self.leafs = collections.Counter(leafs)
        self.count = count
    def __repr__(self):
        return 'Branch(leafs={0}, count={1})'.format(self.leafs, self.count)

BRANCHES = [Branch(['leaf1', 'leaf2']),
            Branch(['leaf3', 'leaf4', 'leaf3']),
            Branch(['leaf6', 'leaf7']),
           ]

Usage:

>>> import pprint
>>> pprint.pprint(BRANCHES)
[Branch(leafs=Counter({'leaf1': 1, 'leaf2': 1}), count=0),
 Branch(leafs=Counter({'leaf3': 2, 'leaf4': 1}), count=0),
 Branch(leafs=Counter({'leaf7': 1, 'leaf6': 1}), count=0)]
>>> first_branch = BRANCHES[0]
>>> first_branch.count += 23
>>> first_branch
Branch(leafs=Counter({'leaf1': 1, 'leaf2': 1}), count=23)
>>> first_branch.leafs['leaf that does not exist']
0
>>> first_branch.leafs.update(['new leaf'])
>>> first_branch
Branch(leafs=Counter({'new leaf': 1, 'leaf1': 1, 'leaf2': 1}), count=23)
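
To create branches on demand instead of listing them up front, one option (a sketch, not part of the original answer) is to use Branch itself as a defaultdict factory, which works because it can be called with no arguments. The branch and tag names below are invented:

```python
import collections

class Branch(object):
    def __init__(self, leafs=(), count=0):
        self.leafs = collections.Counter(leafs)
        self.count = count

# Branch() is called with no arguments the first time a name is looked up.
branches = collections.defaultdict(Branch)
branches['first_branch'].count += 23
branches['first_branch'].leafs['tag1'] += 30
branches['first_branch'].leafs.update(['tag2'] * 10)

# Counter returns 0 for an unknown key without storing it.
missing = branches['first_branch'].leafs['no such tag']
```

Unlike a defaultdict(int), reading a missing key from a Counter does not add that key, so lookups leave the leaf counts unchanged.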