Keeping averages up to date when nested dictionaries are modified in place

Time: 2016-05-06 22:29:11

Tags: python list class dictionary nested

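I'm trying to store a list of dictionaries with (possibly) overlapping keys and keep track of the average value of each key across all of those dictionaries. I've created a class that mostly works, but the averages are not updated if I directly modify a dictionary in the list. Could you point me to an implementation that tracks changes both to whole dictionaries in the list and to the contents of the dictionaries in the list?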

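I'd like to be able to access either a specific dictionary in the list or one of the averaged values by indexing an instance of the class like a list or like a dictionary, respectively. Below I've provided a (possibly slightly more than) minimal working example of the class.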

import numpy as np

class ListOfDicts(object):
    """
    Object to store 
        (i)  a list of dictionaries, self.y, and 
        (ii) a dictionary, self.x, which maps each unique key 
             of the dicts in self.y to the average of the value 
             of that key across all dicts in self.y

    """
    def __init__(self, list_):
        """
        self.y - list of dictionaries
        self.x - dictionary containing average of all entries across all
                 dicts in self.y

        """
        # Allow either a list of dicts or a list of sequences
        # (a sequence is keyed by the position of each element)
        try:
            self.y = [ { k:v for k, v in i.items() } for i in list_ ]
        except AttributeError:
            self.y = [ { k:v for k, v in enumerate(i) } for i in list_ ]
        self.x = self._update_x()

    def __repr__(self):
        cls = self.__class__.__name__
        return '%r(%r)' % (cls, repr(self.y))

    def __len__(self):
        return len(self.y)

    def __iter__(self):
        return iter(self.y)

    def iterkeys(self):
        return iter(self.y)

    def __contains__(self, key):
        """
        Returns true if key is either an index of self.y or 
        a key of a dict within self.y.
        The keys of all dicts within self.y are keys of self.x.

        """
        if isinstance(key, int):
            # An integer is treated as an index into self.y
            return -len(self.y) <= key < len(self.y)
        return key in self.x

    def __getitem__(self, key):
        """
        If key is an index of self.y, get the corresponding dict. 
        If instead the key is a key of self.x, return the value of x[key].

        """
        try:
            return self.y[key]
        except TypeError:
            return self.x[key]

    def __setitem__(self, key, valdict):
        """
        Set the value of a dict of self.y to the dictionary valdict, 
        then update the dictionary of average values, self.x.        
        If key is not an index of self.y, this will throw an error.

        """
        self.y[key] = valdict
        self.x = self._update_x()

    def __delitem__(self, key):
        """
        Remove a dict from self.y, then update self.x.
        This does not relabel other dictionary indices.

        """
        del self.y[key]
        self.x = self._update_x()

    def _update_x(self):
        """
        Calculate an average of the values of each unique key, k, 
        in the dictionaries within self.y:
            { k: <s[k] for s in self.y> }, 
        where <s[k] ... > is an average over all values of k in 
        each dictionary in y.

        """
        # Find the set of unique keys in the dictionaries of self.y
        # (set().union also copes with an empty self.y, unlike reduce)
        keys = set().union(*self.y)
        # Calculate averages for each key and store them in a dictionary
        temp = { k : np.average([ s[k] for s in self.y if k in s ])
                 for k in keys }
        return temp

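I define y as the list of dictionaries and x as the dictionary of averages. I modified __getitem__ to first look up the key as an index in the list y and, if that fails, to look it up as a key in the averages dictionary x. I modified __setitem__ to replace the specified dictionary in y with a new dictionary and then recompute the averages in x.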
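To make the lookup order concrete, here is a stripped-down sketch of just that dispatch. It is written for Python 3, and the class name Lookup is made up for this illustration; it is not part of the class above.

```python
class Lookup:
    """Hypothetical stand-in that keeps only the __getitem__ dispatch."""

    def __init__(self, y, x):
        self.y = y  # list of dicts
        self.x = x  # dict of averages

    def __getitem__(self, key):
        try:
            # Integers (and slices) are valid list indices
            return self.y[key]
        except TypeError:
            # Anything else, e.g. a string, is not a valid list index,
            # so fall back to the averages dictionary
            return self.x[key]


demo = Lookup([{'a': 0.5}], {'a': 0.5})
first = demo[0]    # the first dictionary in y
avg_a = demo['a']  # the average stored in x
```

Note that an out-of-range integer raises IndexError, which the except TypeError clause does not catch, so such a key is never looked up in x.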

Some examples where the class behaves the way I want:

>>> test = ListOfDicts([{'a':0.5, 'b':0.5},{'b':0.4, 'c':0.6}])

>>> test
'ListOfDicts'("[{'a': 0.5, 'b': 0.5}, {'c': 0.6, 'b': 0.4}]")

>>> test.y
[{'a': 0.5, 'b': 0.5}, {'b': 0.4, 'c': 0.6}]

>>> test.x
{'a': 0.5, 'b': 0.45000000000000001, 'c': 0.59999999999999998}

>>> test[0]
{'a': 0.5, 'b': 0.5}

>>> test[1]
{'b': 0.4, 'c': 0.6}

>>> test['a']
0.5

>>> test['b']
0.45000000000000001

>>> test[0] = {'a':0.3, 'b':0.4, 'd':0.3}
>>> test.x
{'a': 0.29999999999999999, 'b': 0.40000000000000002, 
 'c': 0.59999999999999998, 'd': 0.29999999999999999}

The following produces the undesired behavior:

>>> test[0]['a'] = 0.0
>>> test[0]['b'] = 0.7
>>> test.x
{'a': 0.29999999999999999, 'b': 0.40000000000000002, 
 'c': 0.59999999999999998, 'd': 0.29999999999999999}

I would like test.x to be:

>>> test.x
{'a': 0.0,  'b': 0.55000000000000004, 'c': 0.59999999999999998, 
 'd': 0.29999999999999999}

That is, when I modify the dictionary test[0] in place, the averages dictionary x is not updated, and printing test.x returns stale values.
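The reason appears to be that a nested assignment such as test[0]['a'] = 0.0 only calls __getitem__ on the container; the assignment itself goes to the __setitem__ of the inner, plain dict. A small Python 3 sketch that records which special methods run (the Spy class is hypothetical and exists only for this demonstration):

```python
class Spy(list):
    """Hypothetical list subclass that records which special methods run."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []

    def __getitem__(self, key):
        self.calls.append('__getitem__')
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        self.calls.append('__setitem__')
        super().__setitem__(key, value)


spy = Spy([{'a': 0.3}])

spy[0]['a'] = 0.0                # container only sees __getitem__
after_nested = list(spy.calls)

spy[0] = {'a': 0.1}              # container sees __setitem__
after_replace = list(spy.calls)
```

So any code placed in the container's __setitem__, such as the call to _update_x, never runs for the nested assignment.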

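I have two questions:

  1. Is there a good way to fix this, so that any modification to a dictionary in y triggers an update of x?

  2. Is there a better way to achieve my overall goal, which is to keep track of averages (and other quantities) derived from a list of dictionaries as the underlying dictionaries are modified?

One specific detail about how the class is used: x will be read much more often than y is changed, so I don't want to recompute x every time it is accessed.

If you would like more details, or have any other questions, please let me know. Thanks in advance for your help!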
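To illustrate the behavior being asked for, here is a rough Python 3 sketch of one possible direction, not a tested solution: wrap each inner dictionary in a dict subclass that notifies its owner on change, and recompute x lazily only when something has changed. The names NotifyingDict, AveragedDicts, on_change and _dirty are all made up for this sketch, and a plain mean replaces np.average to keep it self-contained.

```python
class NotifyingDict(dict):
    """Hypothetical dict subclass that calls a callback after in-place changes."""

    def __init__(self, data, on_change):
        super().__init__(data)
        self._on_change = on_change

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._on_change()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._on_change()

    # Assumption of this sketch: only d[k] = v and del d[k] are used.
    # Methods such as update(), pop() and clear() do not go through the
    # overrides above and would need the same treatment.


class AveragedDicts:
    """Hypothetical container: list of dicts plus lazily cached averages."""

    def __init__(self, dicts):
        self._dirty = True
        self._x = {}
        self.y = [NotifyingDict(d, self._mark_dirty) for d in dicts]

    def _mark_dirty(self):
        self._dirty = True

    def __getitem__(self, index):
        return self.y[index]

    def __setitem__(self, index, valdict):
        self.y[index] = NotifyingDict(valdict, self._mark_dirty)
        self._mark_dirty()

    @property
    def x(self):
        # Recompute only if something changed since the last read
        if self._dirty:
            averages = {}
            for key in set().union(*self.y):
                values = [d[key] for d in self.y if key in d]
                averages[key] = sum(values) / len(values)
            self._x = averages
            self._dirty = False
        return self._x


demo = AveragedDicts([{'a': 0.5, 'b': 0.5}, {'b': 0.4, 'c': 0.6}])
demo[0] = {'a': 0.3, 'b': 0.4, 'd': 0.3}
demo[0]['a'] = 0.0
demo[0]['b'] = 0.7
averages = demo.x
```

Because of the dirty flag, repeated reads of demo.x between modifications return the cached dictionary without recomputing anything, which matches the read-heavy usage described above.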

0 answers:

No answers