Computing a rolling correlation for pandas DataFrames

Time: 2019-08-07 06:00:46

Tags: python pandas

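Is it possible to use a rolling window and the correlation function in pandas to correlate a shorter DataFrame or Series with a longer one, and get a result along the length of the longer time series? Essentially, I want to do what the numpy.correlate method does, but compute a pairwise (Pearson) correlation for each window instead of a cross-correlation.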

import numpy as np

x = [0,1,2,3,4,5,4,7,6,9,10,5,6,4,8,7]
y = [4,5,4,5]
print(x)
print(y)
# Correlate y with every window of x that has the same length as y
corrs = []
for i in range(0, len(x) - 3):
    corrs.append(np.corrcoef(x[i:i+4], y)[0,1])

The result is:

[0.4472135954999579, 0.4472135954999579, 0.4472135954999579, 0.0, 0.8164965809277259, -0.4472135954999579, 0.8320502943378437, 0.0, -0.24253562503633297, 0.24253562503633297, -0.7683498199278325, 0.8451542547285166, -0.50709255283711]

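Every combination of window size and the pairwise option gives either a series of NaNs or "ValueError: Length mismatch". In the simple test cases I tried, the output is always NaN or a single result, never a value per window.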

import pandas as pd

x = pd.DataFrame(x)
y = pd.DataFrame(y)

corr = y.rolling(np.shape(y)[0]).corr(x)
print(corr)
corr = y.rolling(np.shape(x)[0]).corr(x)
print(corr)
corr = x.rolling(np.shape(x)[0]).corr(y)
print(corr)
corr = x.rolling(np.shape(y)[0]).corr(y)
print(corr)
corr = y.rolling(np.shape(y)[0]).corr(x,pairwise=True)
print(corr)
corr = y.rolling(np.shape(x)[0]).corr(x,pairwise=True)
print(corr)
corr = x.rolling(np.shape(x)[0]).corr(y,pairwise=True)
print(corr)
corr = x.rolling(np.shape(y)[0]).corr(y,pairwise=True)
print(corr)

1 Answer:

Answer 0: (score: 1)

Use Rolling.apply with np.corrcoef, or with Series.corr. For Series.corr, each window must have the same index values as y, so Series.reset_index with drop=True is necessary.
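To see why the index reset matters, here is a minimal sketch that takes a single window by hand. The variable names `window`, `misaligned` and `aligned` are only illustrative. It relies on Series.corr aligning both operands on their index labels before computing anything.

```python
import numpy as np
import pandas as pd

y = pd.Series([4, 5, 4, 5])                            # labels 0..3
# One window of x as rolling(...).apply(raw=False) would pass it:
# the values x[4:8], still carrying x's labels 4..7.
window = pd.Series([4, 5, 4, 7], index=[4, 5, 6, 7])

# No labels in common, so nothing is left after alignment and the result is NaN
misaligned = window.corr(y)

# Resetting the labels to 0..3 makes the two Series line up position by position
aligned = window.reset_index(drop=True).corr(y)

# Reference value computed on plain arrays, where labels play no role
expected = np.corrcoef(window.to_numpy(), y.to_numpy())[0, 1]

print(misaligned, aligned, expected)
```

With raw=True the window arrives as a plain NumPy array, so np.corrcoef needs no such step.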

import numpy as np
import pandas as pd

x = [0,1,2,3,4,5,4,7,6,9,10,5,6,4,8,7]
y = [4,5,4,5]

corrs = []
for i in range(0,len(x)-3):
    corrs.append( np.corrcoef(x[i:i+4],y)[0,1] )

x = pd.Series(x)
y = pd.Series(y)

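# raw=True passes each window as a NumPy array, so index labels play no role
corr1 = x.rolling(np.shape(y)[0]).apply(lambda w: np.corrcoef(w, y)[0,1], raw=True)
# raw=False passes each window as a Series; reset its index so it aligns with y
corr2 = x.rolling(np.shape(y)[0]).apply(lambda w: w.reset_index(drop=True).corr(y), raw=False)

# Shift the loop result by 3 so it lines up with the rolling output (window end)
print(pd.concat([pd.Series(corrs).rename(lambda i: i + 3), corr1, corr2], axis=1))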
           0         1         2
0        NaN       NaN       NaN
1        NaN       NaN       NaN
2        NaN       NaN       NaN
3   0.447214  0.447214  0.447214
4   0.447214  0.447214  0.447214
5   0.447214  0.447214  0.447214
6   0.000000  0.000000  0.000000
7   0.816497  0.816497  0.816497
8  -0.447214 -0.447214 -0.447214
9   0.832050  0.832050  0.832050
10  0.000000  0.000000  0.000000
11 -0.242536 -0.242536 -0.242536
12  0.242536  0.242536  0.242536
13 -0.768350 -0.768350 -0.768350
14  0.845154  0.845154  0.845154
15 -0.507093 -0.507093 -0.507093
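If the per-window Python call in Rolling.apply turns out to be slow on long series, one possible alternative is to compute all windows at once in NumPy. The sketch below is not part of the answer above. It assumes NumPy 1.20 or later (for `sliding_window_view`), and the function name `rolling_corr_with_pattern` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

x = np.array([0, 1, 2, 3, 4, 5, 4, 7, 6, 9, 10, 5, 6, 4, 8, 7], dtype=float)
y = np.array([4, 5, 4, 5], dtype=float)

def rolling_corr_with_pattern(x, y):
    """Pearson correlation of y with every len(y)-long window of x."""
    windows = sliding_window_view(x, len(y))            # shape (len(x) - len(y) + 1, len(y))
    wc = windows - windows.mean(axis=1, keepdims=True)  # center each window
    yc = y - y.mean()                                   # center the pattern
    num = wc @ yc                                       # per-window covariance numerator
    den = np.sqrt((wc ** 2).sum(axis=1) * (yc ** 2).sum())
    # A constant window (or constant y) gives den == 0 and therefore NaN
    return num / den

corr = rolling_corr_with_pattern(x, y)
print(corr)
```

The result has one value per full window, len(x) - len(y) + 1 in total, so it corresponds to the non-NaN rows of the table above; to line it up with the rolling output, place it at index positions len(y) - 1 onward.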