Python: Why does adding dict values per key depend on order?

Asked: 2013-07-14 23:08:27

Tags: python dictionary

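Suppose you have several dictionaries that each track three float values per key (in a sub-dictionary). You want to merge these dictionaries so that the values are added together for any key that exists in more than one of them.

With a normal dict update the values get overwritten, so you subclass dict():
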
class StatementDict(dict):
    def add(self, statement):
        ann_id = statement[0]
        lvl_dict = statement[1]
        if ann_id in self:
            self[ann_id]['skill'] += lvl_dict['skill']
            self[ann_id]['knowledge'] += lvl_dict['knowledge']
            self[ann_id]['interest'] += lvl_dict['interest']
        else:
            self[ann_id] = lvl_dict

    def update(self, statement_dict):
        for statement in statement_dict.iteritems():
            self.add(statement)

Then you put the dicts you want to merge/add into an ordinary dict, keyed by origin:

# Small example data that reproduces the error
few_statements = {}
few_statements['linkedin'] = {u'Homerun': {u'skill': 14.0,
                                           u'knowledge': 34.0,
                                           u'interest': 20.0}}
few_statements['tudelft'] = {u'Presentation': {u'skill': 14.0,
                                               u'knowledge': 34.0,
                                               u'interest': 20.0},
                             u'Future': {u'skill': 16.0,
                                         u'knowledge': 25.33,
                                         u'interest': 2.0},
                             u'Visual_perception': {u'skill': 20.46,
                                                    u'knowledge': 28.35,
                                                    u'interest': 4.0}}
few_statements['website'] = {u'Homerun': {u'skill': 1.0,
                                          u'knowledge': 3.0,
                                          u'interest': 2.0}}

few_statements['shareworks'] = {u'Presentation': {u'skill': 8.0,
                                                  u'knowledge': 20.0,
                                                  u'interest': 12.0},
                                u'Future': {u'skill': 17.0,
                                            u'knowledge': 26.33,
                                            u'interest': 3.0},
                                u'Visual_perception': {u'skill': 2.0,
                                                       u'knowledge': 3.0,
                                                       u'interest': 6.0}}

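Now we should be able to add these key-value pairs to a StatementDict() one by one, or use the StatementDict.update() method. The order in which the source dicts are added to the StatementDict should not affect the result.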

# First we try updating in one order
small_test1a = StatementDict()
for origin in ("tudelft", "website", "linkedin", "shareworks"):
    for st in few_statements[origin].iteritems():
        small_test1a.add(st)

# And then in another order
small_test2 = StatementDict()
for origin in ("linkedin", "shareworks", "tudelft", "website"):
    for st in few_statements[origin].iteritems():
        small_test2.add(st)

print "Different order, same result?", small_test1a == small_test2
                                                # False, but why?
for key in small_test1a:
    print "Desired:", key, small_test1a[key]
    print "Unexpected:", key, small_test2[key]

Alas, the order in which the dicts are added does affect the result. But why, and what happened in the unexpected result?

Desired: Future {u'skill': 33.0, u'knowledge': 51.66, u'interest': 5.0}
Unexpected: Future {u'skill': 50.0, u'knowledge': 77.99, u'interest': 8.0}
Desired: Presentation {u'skill': 22.0, u'knowledge': 54.0, u'interest': 32.0}
Unexpected: Presentation {u'skill': 30.0, u'knowledge': 74.0, u'interest': 44.0}
Desired: Homerun {u'skill': 15.0, u'knowledge': 37.0, u'interest': 22.0}
Unexpected: Homerun {u'skill': 29.0, u'knowledge': 71.0, u'interest': 42.0}
Desired: Visual_perception {u'skill': 22.46, u'knowledge': 31.35, u'interest': 10.0}
Unexpected: Visual_perception {u'skill': 24.46, u'knowledge': 34.35, u'interest': 16.0}

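Adding the dicts in the second order seems to double the values of the dict that was placed first (added twice?). I don't understand why this happens. How can I reliably get the desired addition behavior, independent of the order in which the dicts are added?

Another thing I don't understand: why does the value of small_test1a change when I create a new StatementDict() and fill it with the same values?

Running the following lines causes small_test1a to change during the last iteration of the loop: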

small_test1b = StatementDict()
for origin in ("tudelft", "website", "linkedin", "shareworks"):
    small_test1b.update(few_statements[origin])
print "\nDoes .update() function?", small_test1a == small_test1b
print small_test1a

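P.S. With my actual data, no addition happens at all. Instead, the value that was placed first is kept. This is the same as updating a normal dictionary, where values get overwritten. Unfortunately, I could not reproduce this behavior with the small test data.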

1 Answer:

Answer 0: (score: 1)

When you do this:

self[ann_id] = lvl_dict

you make self[ann_id] another name for that particular dictionary (e.g., the one from "tudelft"). Then, when you subsequently do:

self[ann_id]['skill'] += lvl_dict['skill']

you modify the previous lvl_dict based on the current lvl_dict (e.g., in this case you change "tudelft" based on "website").
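The aliasing described above can be shown in isolation. This is a minimal sketch written for Python 3 (the question uses Python 2); the names `source` and `merged` are made up for the illustration and do not appear in the question.

```python
# Hypothetical names: `source` plays the role of few_statements[origin],
# `merged` plays the role of the StatementDict.
source = {'Homerun': {'skill': 14.0}}
merged = {}

# Assignment stores a reference to the same inner dict, not a copy.
merged['Homerun'] = source['Homerun']
print(merged['Homerun'] is source['Homerun'])  # True

# An in-place update through `merged` is therefore visible through `source`.
merged['Homerun']['skill'] += 1.0
print(source['Homerun']['skill'])  # 15.0
```

This also appears to account for the "Unexpected" numbers in the question: by the time the second loop runs, the first loop has already changed some of the sub-dictionaries inside few_statements, so the second StatementDict is built from altered input.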

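The minimal fix for this is to copy the first dictionary. However, I would probably try to use a collections.defaultdict, so that you can get rid of the if ann_id in self: test completely. When the defaultdict creates a new dictionary, it will be a new instance, so no existing dictionaries are modified.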
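A sketch of the minimal fix, assuming Python 3 (so items() replaces iteritems()). The `LEVELS` tuple and the two-entry `sources` dict are introduced here only for illustration; the one substantive change from the question's class is the dict(lvl_dict) copy in the else branch.

```python
class StatementDict(dict):
    # Hypothetical helper constant listing the three tracked values.
    LEVELS = ('skill', 'knowledge', 'interest')

    def add(self, statement):
        ann_id, lvl_dict = statement
        if ann_id in self:
            for level in self.LEVELS:
                self[ann_id][level] += lvl_dict[level]
        else:
            # Store a shallow copy so the caller's dict is never mutated.
            self[ann_id] = dict(lvl_dict)

    def update(self, statement_dict):
        for statement in statement_dict.items():
            self.add(statement)


# Reduced sample data taken from the question.
sources = {
    'linkedin': {'Homerun': {'skill': 14.0, 'knowledge': 34.0, 'interest': 20.0}},
    'website': {'Homerun': {'skill': 1.0, 'knowledge': 3.0, 'interest': 2.0}},
}

first = StatementDict()
for origin in ('linkedin', 'website'):
    first.update(sources[origin])

second = StatementDict()
for origin in ('website', 'linkedin'):
    second.update(sources[origin])

print(first == second)                         # True
print(sources['website']['Homerun']['skill'])  # 1.0, input left untouched
```

A shallow copy is enough here because the inner values are floats; nested mutable values would need copy.deepcopy instead.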


Example of using defaultdict and a lambda function, as per the comment below:

from collections import defaultdict

class StatementDict(defaultdict):
    def __init__(self):
        defaultdict.__init__(self, lambda: {'skill': 0.0, 'knowledge': 0.0, 'interest': 0.0})

    def add(self, statement):
        ... as before ...
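For comparison, here is one way the defaultdict idea could be filled out so that add() needs no membership test. This is a sketch for Python 3, not the answerer's exact code: the body of add() and the sample dicts `tudelft_future` and `shareworks_future` (values taken from the question) are assumptions made for the illustration.

```python
from collections import defaultdict


class StatementDict(defaultdict):
    def __init__(self):
        # The factory builds a fresh zeroed dict for every missing key.
        super().__init__(lambda: {'skill': 0.0, 'knowledge': 0.0, 'interest': 0.0})

    def add(self, statement):
        ann_id, lvl_dict = statement
        totals = self[ann_id]  # created on first access by the factory
        for level, value in lvl_dict.items():
            totals[level] += value


tudelft_future = {'skill': 16.0, 'knowledge': 25.33, 'interest': 2.0}
shareworks_future = {'skill': 17.0, 'knowledge': 26.33, 'interest': 3.0}

one_order = StatementDict()
one_order.add(('Future', tudelft_future))
one_order.add(('Future', shareworks_future))

other_order = StatementDict()
other_order.add(('Future', shareworks_future))
other_order.add(('Future', tudelft_future))

print(one_order == other_order)  # True
print(tudelft_future['skill'])   # 16.0, input left untouched
```

Because every stored sub-dictionary is created by the factory, the source dicts are only ever read. Note that the update() inherited from dict still overwrites values, so the update() override from the question would have to be kept if that method is used.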