Question

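I have a time series whose float index represents minutes elapsed since the start of an experiment. The observations are not perfectly equidistant. I am computing a rolling mean. Here is some sample data: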

import pandas as pd

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])
# Fixed window of 3 observations (win_type='triang' requires SciPy)
Sr = S.rolling(3, win_type='triang', center=True).mean()

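In my real data the window spans hundreds of data points. I would therefore like it to always span the same amount of time (in index units) rather than a fixed number of observations. I found that this is possible with a datetime index, but I need the index to stay float for further calculations. Is there a way to do this without converting the index to datetime and back again?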

Pseudocode:

Sr = S.rolling(0.3, win_type='triang', center=True, *on=index*).mean()

Expected output for this example:

For each index i: the mean over the window from i-0.15 to i+0.15, with triangular weights based on distance from i.

Answer 1

I don't think the rolling method can do this.

Out of interest, it can be done by hand as follows:

from scipy.signal.windows import triang
import numpy as np
import pandas as pd

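def triangular(a):
    # Triangular weights by position within the window (not by distance from the center).
    # Dividing by the sum of the weights keeps this a weighted mean for any window size.
    w = triang(a.size)
    return w @ a / w.sum()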

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])

df = pd.DataFrame({'S': S})
# For each index x, collect the values whose index lies in (x - 0.15, x + 0.15].
df['neighbours'] = df.index.to_series().apply(
    lambda x: [df.loc[i, 'S'] for i in df.index if x - 0.15 < i <= x + 0.15])
df['rolling_mean'] = df.neighbours.apply(lambda x: triangular(np.array(x)))
df.drop('neighbours', axis=1, inplace=True)

print(df)

Output:

       S  rolling_mean
0.01   0          1.50
0.13   3          2.00
0.20   2          3.25
0.29   6          4.50
0.40   4          5.25
0.50   7          6.25
0.59   7          7.50
0.68   9          9.00
0.79  11         11.00
0.90  13         12.25
1.00  12         12.25
1.10  12         11.75
1.19  11         10.75
1.29   9          8.75
1.40   6          7.00
1.50   7          5.75
1.60   3          4.50
1.71   5          4.25
1.80   4          4.50
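
Note that the code above weights each neighbour by its position inside the window, whereas the question asks for weights based on distance from i. A minimal sketch of a distance-weighted variant follows. The helper name triangular_rolling_mean and its width parameter are made up for illustration and are not part of pandas. It builds a full pairwise distance matrix, so memory grows with the square of the series length; for long series something like chunking or np.searchsorted would be needed instead.

```python
import numpy as np
import pandas as pd

def triangular_rolling_mean(s, width):
    """Centered rolling mean over a float index, triangular weights by distance.

    Hypothetical helper for illustration; `width` is the full window width
    in index units (0.3 means i-0.15 to i+0.15).
    """
    x = s.index.to_numpy(dtype=float)
    y = s.to_numpy(dtype=float)
    half = width / 2.0
    # Pairwise distances: rows are window centers, columns are samples.
    dist = np.abs(x[:, None] - x[None, :])
    # Weight falls linearly from 1 at the center to 0 at distance `half`.
    weights = np.clip(1.0 - dist / half, 0.0, None)
    # Every row has weight 1 on its own center, so the row sum is never zero.
    return pd.Series(weights @ y / weights.sum(axis=1), index=s.index)

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])

Sr = triangular_rolling_mean(S, 0.3)
print(Sr)
```

The result keeps the original float index, so no conversion is involved.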

However, I doubt that this is any simpler than converting the float index to datetime and back again.
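
For comparison, here is a rough sketch of that round trip, using a TimedeltaIndex rather than actual dates. It assumes pandas 1.3 or later, where center=True is accepted for time-based windows. As far as I know win_type cannot be combined with a time-based window, so this gives a plain (unweighted) mean over each 0.3-minute window rather than a triangular one.

```python
import pandas as pd

S = pd.Series([0,3,2,6,4,7,7,9,11,13,12,12,11,9,6,7,3,5,4],
              index=[0.01,0.13,0.2,0.29,0.4,0.5,0.59,0.68,0.79,0.9,1.0,1.1,1.19,1.29,1.4,1.5,1.6,1.71,1.8])

# Interpret the float index as minutes and convert it to a TimedeltaIndex.
St = S.copy()
St.index = pd.to_timedelta(S.index, unit='min')

# 0.3 minutes = 18 seconds; the window is centered on each observation.
Sr = St.rolling('18s', center=True).mean()

# Put the original float index back for further calculations.
Sr.index = S.index
print(Sr)
```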
