List

Time: 2018-03-23 10:47:06

Tags: python pandas numpy mean

What is a pythonic way to compute the mean of a list, considering only positive values?

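So if I have the values [1, 2, 3, 4, 5, -1, 4, 2, 3] and want a rolling mean over three values, that is effectively the rolling mean of [1, 2, 3, 4, 5, nan, 4, 2, 3]. The result should be [nan, 2, 3, 4, 4.5, 4.5, 3, 3, nan], where the first and last nan are due to missing neighbours. For example:

2 = mean([1, 2, 3])
3 = mean([2, 3, 4])
but 4.5 = mean([4, 5, nan]) = mean([4, 5])

and so on. The important point is that negative values are excluded, and the divisor is the number of positive values in the window.

I tried: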

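import pandas as pd

def RollingPositiveAverage(listA, nElements):
     # Filtering drops the non-positive values, so later positions shift
     listB = [element for element in listA if element > 0]
     return pd.Series(listB).rolling(nElements).mean()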

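But listB is then missing elements. I tried replacing those elements with NaN instead, but then the mean itself becomes NaN.

Is there a nice, elegant way to solve this?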

Thanks

2 answers:

Answer 0: (score: 3)

Since you are using Pandas:

import numpy as np
import pandas as pd

def RollingPositiveAverage(listA, window=3):
     s = pd.Series(listA, dtype=float)  # float dtype so NaN can be stored
     s[s < 0] = np.nan  # mask negatives; mean() skips NaN
     result = s.rolling(window, center=True, min_periods=1).mean()
     half = window // 2
     if half:  # guard: iloc[-0:] would select every row
          result.iloc[:half] = np.nan
          result.iloc[-half:] = np.nan
     return result  # or result.values or list(result) if you prefer array or list

print(RollingPositiveAverage([1, 2, 3, 4, 5, -1, 4, 2, 3]))

Output:

0    NaN
1    2.0
2    3.0
3    4.0
4    4.5
5    4.5
6    3.0
7    3.0
8    NaN
dtype: float64
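
The min_periods=1 argument is what stops a window containing a masked value from turning into NaN. Here is a minimal sketch of the difference, assuming a pandas version that provides Series.rolling; the variable names strict and relaxed are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample data with the negative value already masked as NaN.
s = pd.Series([1, 2, 3, 4, 5, np.nan, 4, 2, 3], dtype=float)

# Default min_periods equals the window size, so any window that
# contains a NaN (or runs off the edge) produces NaN.
strict = s.rolling(3, center=True).mean()

# min_periods=1 averages whatever non-NaN values the window holds.
relaxed = s.rolling(3, center=True, min_periods=1).mean()

print(strict.tolist())
print(relaxed.tolist())
```

With the default setting, the three windows that touch the masked value come out as NaN, which matches the problem described in the question. With min_periods=1 they are averaged over the remaining values, and the two edge positions get a two-value average, which is why the function above overwrites them with NaN afterwards.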

Pure Python version:

import math

def RollingPositiveAverage(listA, window=3):
    result = [math.nan] * (window // 2)  # pad the left edge
    for win in zip(*(listA[i:] for i in range(window))):
        win = tuple(v for v in win if v >= 0)  # drop negative values
        # Divide by the count of kept values; max() avoids dividing by zero
        result.append(float(sum(win)) / max(len(win), 1))
    result.extend([math.nan] * (window // 2))  # pad the right edge
    return result

print(RollingPositiveAverage([1, 2, 3, 4, 5, -1, 4, 2, 3]))

Output:

[nan, 2.0, 3.0, 4.0, 4.5, 4.5, 3.0, 3.0, nan]
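
The zip(*(listA[i:] for i in range(window))) expression is the part that builds the sliding windows: it zips the list against shifted copies of itself and stops at the shortest copy. A small sketch of just that step follows; the helper name sliding_windows is made up for illustration:

```python
def sliding_windows(values, window=3):
    # Hypothetical helper: zip the list against copies shifted by 0..window-1.
    # zip stops at the shortest copy, so only full windows are produced.
    return list(zip(*(values[i:] for i in range(window))))

wins = sliding_windows([1, 2, 3, 4, 5, -1, 4, 2, 3])
print(wins[0])    # first full window
print(wins[4])    # a window that contains the negative value
print(len(wins))  # len(values) - window + 1 windows
```

Nine values give seven full windows for window=3, which is why the function pads one NaN on each side to keep the output the same length as the input.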

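Answer 1: (score: 2)

Compute the rolling sums, compute the rolling counts of valid elements by applying the same rolling sum to a mask of the non-negative elements, and divide one by the other to get the mean. For the rolling sums we can use np.convolve.

Hence, the implementation -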

import numpy as np

def rolling_mean(a, W=3):
    a = np.asarray(a) # convert to array
    k = np.ones(W) # kernel for convolution

    # Mask of positive numbers and get clipped array
    m = a>=0
    a_clipped = np.where(m,a,0)

    # Get rolling windowed summations and divide by the rolling valid counts
    return np.convolve(a_clipped,k,'same')/np.convolve(m,k,'same')
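
To see what the two convolutions produce, here is a sketch that prints the intermediate rolling sums and valid counts for the sample input. The variable names are illustrative, and the mask is cast to float explicitly, which is a choice made here for clarity rather than something the function above requires:

```python
import numpy as np

a = np.asarray([1, 2, 3, 4, 5, -1, 4, 2, 3])
k = np.ones(3)  # window of three equal weights

m = a >= 0
# Rolling sums with negatives replaced by 0.
sums = np.convolve(np.where(m, a, 0), k, 'same')
# Number of valid (non-negative) values in each window.
counts = np.convolve(m.astype(float), k, 'same')

print(sums)
print(counts)
print(sums / counts)
```

The 'same' mode pads the edges with zeros, so the first and last windows count only two values, and the edge entries are averages of two values rather than three.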

Extending to the specific case of NaN-padding at the boundaries -

def rolling_mean_pad(a, W=3):
    hW = (W-1)//2 # half window size for padding
    a = np.asarray(a) # convert to array
    k = np.ones(W) # kernel for convolution

    # Mask of positive numbers and get clipped array
    m = a>=0
    a_clipped = np.where(m,a,0)

    # Get rolling windowed summations and divide by the rolling valid counts
    out = np.convolve(a_clipped,k,'same')/np.convolve(m,k,'same')
    if hW:  # guard: out[-0:] would select the whole array
        out[:hW] = np.nan
        out[-hW:] = np.nan
    return out

Sample run -

In [54]: a
Out[54]: array([ 1,  2,  3,  4,  5, -1,  4,  2,  3])

In [55]: rolling_mean_pad(a, W=3)
Out[55]: array([ nan,  2. ,  3. ,  4. ,  4.5,  4.5,  3. ,  3. ,  nan])
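
One case the sample run does not exercise is a window with no non-negative values at all. The valid count is then zero and the division is 0/0, which NumPy evaluates to NaN while emitting a RuntimeWarning. A possible way to keep the NaN but silence the warning is np.errstate; the function name rolling_mean_quiet below is hypothetical and the sketch is a variant, not part of the answer above:

```python
import numpy as np

def rolling_mean_quiet(a, W=3):
    # Hypothetical variant of rolling_mean that silences the 0/0 warning.
    a = np.asarray(a)
    k = np.ones(W)
    m = a >= 0
    sums = np.convolve(np.where(m, a, 0), k, 'same')
    counts = np.convolve(m.astype(float), k, 'same')
    # A window with no valid values yields 0/0, which NumPy turns into NaN.
    with np.errstate(invalid='ignore', divide='ignore'):
        return sums / counts

out = rolling_mean_quiet([1, -1, -2, -3, 5])
print(out)
```

Here the middle window holds only negative values, so its entry is NaN, while the other entries are computed from the single valid value each window contains.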