How do I create a running average of the last N items in a time series?

Time: 2014-10-05 16:00:46

Tags: python linked-list

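My basic idea is to create a linked list and, as each new value comes in, add 1/N times the new value and subtract 1/N times the first (oldest) value, then advance the pointer to the first element by one and free the memory associated with the old first element.

This won't ultimately be implemented in Python, but to get the process clear in my head I tried writing it in Python, and my implementation is flawed. Do I need a doubly linked list? Is there a better alternative that is not based on a linked list?

Here is my attempt so far: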

class Link:
    def __init__(self,val):
        self.next = None
        self.value = val

class LinkedList:
    def __init__(self,maxlength):
        self.current_link = None
        self.maxlength = maxlength
        self.sum = 0.
        self.average = None
        self.length = 0
        self._first_link = None
    def add_link(self,val):
        new_link = Link(val)
        new_link.next = self.current_link
        self.current_link = new_link
        if self._first_link is None:
            self._first_link = self.current_link
        self.sum += val
        if self.length < self.maxlength:
            self.length += 1
        else:
            self.sum -= self._first_link.value
            self._first_link = self._first_link.next # this line is flawed
        self.average = self.sum/self.length
    def get_first(self):
        return self._first_link.value

# Main
ll = LinkedList(5)
for ii in xrange(10):
    ll.add_link(ii)
    print ii,ll.get_first(),ll.average

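The problem is that _first_link is set to a link that has no next. That is, _first_link is set to the first item added, but its next is None, so I don't see how to advance it by one the way I want. This makes me wonder whether a doubly linked list is needed.

I'd appreciate any advice.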
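
For what it's worth, a doubly linked list does not appear to be necessary here. In the attempt above each new link points back at the older one, so the oldest link never gets a next. If each link instead points toward the newer item, the oldest link can be advanced with a single pointer move. Below is a minimal sketch of that idea, assuming Python 3; the names RunningAverageList, head and tail are made up for illustration and are not from the original post.

```python
class Link:
    """One node; `next` points toward the NEWER neighbour."""
    def __init__(self, value):
        self.value = value
        self.next = None


class RunningAverageList:
    """Singly linked FIFO that keeps at most `maxlength` values.

    Hypothetical class, written only to illustrate the pointer direction.
    """
    def __init__(self, maxlength):
        self.maxlength = maxlength
        self.head = None   # oldest link
        self.tail = None   # newest link
        self.length = 0
        self.total = 0.0

    def add(self, value):
        link = Link(value)
        if self.tail is None:
            self.head = link          # very first item is also the oldest
        else:
            self.tail.next = link     # previous newest now points at the new link
        self.tail = link
        self.total += value
        if self.length < self.maxlength:
            self.length += 1
        else:
            # Window is full: drop the oldest value and advance the head.
            # The old head becomes unreachable and is garbage-collected.
            self.total -= self.head.value
            self.head = self.head.next
        return self.total / self.length


ra = RunningAverageList(5)
averages = [ra.add(i) for i in range(10)]
print(averages[-1], ra.head.value)
```

After the loop the window holds the values 5 through 9, so head is the link for 5 and the last average covers those five values.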

3 Answers:

Answer 0: (score: 1)

I think the simplest implementation is to use a circular linked list (a.k.a. a ring):

class Link(object):
    def __init__(self, value=0.0):
        self.next = None
        self.value = value

class LinkedRing(object):
    def __init__(self, length):
        self.sum = 0.0
        self.length = length
        self.current = Link()

        # Initialize all the nodes:
        last = self.current
        for i in xrange(length-1):  # one link is already created
            last.next = Link()
            last = last.next
        last.next = self.current  # close the ring

    def add_val(self, val):
        self.sum -= self.current.value  # drop the oldest value (the slot about to be overwritten)
        self.sum += val
        self.current.value = val
        self.current = self.current.next

    def average(self):
        return self.sum / self.length


# Test example:
rolling_sum = LinkedRing(5)
while True:
    x = float(raw_input())
    rolling_sum.add_val(x)
    print(">> Average: %f" % rolling_sum.average())
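
One behaviour of the ring above worth noting: every node starts at 0.0 and average() always divides by the full length, so until the ring has been filled once the averages are pulled toward zero. If that matters, one possible variant counts how many slots have been written. This is only a sketch, assuming Python 3; CountingRing and its count and total fields are hypothetical names, not part of the answer.

```python
class Link:
    def __init__(self, value=0.0):
        self.value = value
        self.next = None


class CountingRing:
    """Ring of `length` nodes that averages only over values actually added."""
    def __init__(self, length):
        self.length = length
        self.count = 0        # assumption: number of slots written so far
        self.total = 0.0
        self.current = Link()
        last = self.current
        for _ in range(length - 1):   # one link already exists
            last.next = Link()
            last = last.next
        last.next = self.current      # close the ring

    def add_val(self, val):
        # Replace the oldest slot and adjust the running total in one step.
        self.total += val - self.current.value
        self.current.value = val
        self.current = self.current.next
        if self.count < self.length:
            self.count += 1

    def average(self):
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


warmup = CountingRing(5)
for x in (10.0, 20.0):
    warmup.add_val(x)
print(warmup.average())
```

With only two values added to a ring of five, this divides by 2 rather than 5.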

Answer 1: (score: 1)

You can do this with collections.deque and a numerically stable update that maintains the running average:

import collections

class AveragingBuffer(object):
    def __init__(self, maxlen):
        assert( maxlen>1)
        self.q=collections.deque(maxlen=maxlen)
        self.xbar=0.0
    def append(self, x):
        if len(self.q)==self.q.maxlen:
            # remove first item, update running average
            d=self.q.popleft()
            self.xbar=self.xbar+(self.xbar-d)/float(len(self.q))
        # append new item, update running average
        self.q.append(x)
        self.xbar=self.xbar+(x-self.xbar)/float(len(self.q))


if __name__=="__main__":
    import scipy
    ab=AveragingBuffer(10)
    for i in xrange(32):
        ab.append(scipy.rand())
        print ab.xbar, scipy.average(ab.q), len(ab.q)
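
The two updates in append can be derived from the definition of the mean. The notation below is introduced here for illustration: \bar{x}_k is the mean of the k items currently in the deque, d is the item removed and x is the item added.

```latex
% Removing d from n items (in the code, len(self.q) after popleft() is n-1):
\bar{x}_{n-1} = \frac{n\,\bar{x}_n - d}{n-1}
              = \bar{x}_n + \frac{\bar{x}_n - d}{n-1}

% Adding x to m items (in the code, len(self.q) after append() is m+1):
\bar{x}_{m+1} = \frac{m\,\bar{x}_m + x}{m+1}
              = \bar{x}_m + \frac{x - \bar{x}_m}{m+1}
```

The first formula divides by n-1, which is presumably why the constructor asserts maxlen > 1: with maxlen equal to 1 the removal step would divide by zero.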

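Answer 2: (score: 0)

OK, I came up with a solution that works in O(1) time per update. I'm still curious whether anyone has a linked-list-based solution, but this one avoids linked lists entirely: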

class Recent:
    def __init__(self,maxlength):
        self.maxlength = maxlength
        self.length = 0
        self.values = [0 for ii in xrange(maxlength)]
        self.index = 0
        self.total = 0.
        self.average = 0.
    def add_val(self,val):
        last = self.values[self.index%self.maxlength]
        self.values[self.index%self.maxlength] = val
        self.total += val
        self.total -= last
        if self.length < self.maxlength:
            self.length += 1
        self.average = self.total / self.length
        self.index += 1
    def print_vals(self):
        print ""
        for ii in xrange(self.length):
            print ii,self.values[ii%self.maxlength]
        print "average:",self.average

# Example to show it works
rr = Recent(5)
for ii in xrange(3):
    rr.add_val(ii)
rr.print_vals()
for ii in xrange(13):
    rr.add_val(ii)
rr.print_vals()
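
The same bookkeeping (add the new value to a running total, subtract the value that falls out of the window) can also be written as a generator, with collections.deque holding the window instead of a modular index. This is a sketch only, assuming Python 3; running_averages is a hypothetical name, not something from the answers above.

```python
from collections import deque


def running_averages(values, n):
    """Yield the average of the last `n` items seen (fewer during warm-up)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    window = deque()
    total = 0.0
    for value in values:
        window.append(value)
        total += value
        if len(window) > n:
            # The oldest item leaves the window; remove it from the total too.
            total -= window.popleft()
        yield total / len(window)


for avg in running_averages(range(10), 5):
    print(avg)
```

Both append and popleft on a deque are O(1), so each update costs constant time, as in the list-based version. With floating-point input, a running total that is updated by repeated addition and subtraction can accumulate rounding error over very long streams.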