How to use pandas apply_corr()

Asked: 2014-03-19 09:39:10

Tags: python numpy pandas

I just posted this, but nobody was able to solve the problem.

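First, let's create some correlated DataFrames and call rolling_corr() on them. I use dropna() because I am going to make the data sparse later, and I have not set min_periods because I want the results to stay robust and consistent with the chosen window.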

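 import numpy as np
 from pandas import DataFrame, rolling_corr  # rolling_corr is a top-level function in older pandas

 hey = (DataFrame(np.random.random((15, 3))) + .2).cumsum()
 hoo = (DataFrame(np.random.random((15, 3))) + .2).cumsum()
 hey_corr = rolling_corr(hey.dropna(), hoo.dropna(), 4)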

which gives me

In [388]: hey_corr

Out[388]: 

   0        1        2
0  NaN      NaN      NaN
1  NaN      NaN      NaN
2  NaN      NaN      NaN
3  0.991087 0.978383 0.992614
4  0.974117 0.974871 0.989411
5  0.966969 0.972894 0.997427
6  0.942064 0.994681 0.996529
7  0.932688 0.986505 0.991353
8  0.935591 0.966705 0.980186
9  0.969994 0.977517 0.931809
10 0.979783 0.956659 0.923954
11 0.987701 0.959434 0.961002
12 0.907483 0.986226 0.978658
13 0.940320 0.985458 0.967748 
14 0.952916 0.992365 0.973929

Now, when I make the data sparse, it gives me...

hey.ix[5:8,0] = np.nan
hey.ix[6:10,1] = np.nan
hoo.ix[5:8,0] = np.nan
hoo.ix[6:10,1] = np.nan
hey_corr_sparse = rolling_corr(hey.dropna(),hoo.dropna(), 4)
hey_corr_sparse

Out[398]: 

   0        1        2
0  NaN      NaN      NaN
1  NaN      NaN      NaN
2  NaN      NaN      NaN
3  0.991273 0.992557 0.985773
4  0.953041 0.999411 0.958595
11 0.996801 0.998218 0.992538
12 0.994919 0.998656 0.995235
13 0.994899 0.997465 0.997950
14 0.971828 0.937512 0.994037

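Rows are missing. It looks like dropna() on the whole DataFrame drops every row that has a NaN in any column, so we only get results for rows that are complete across all columns.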
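To illustrate the point about dropna(), here is a minimal sketch on a small made-up frame (not the hey/hoo data above). It uses .loc, which in current pandas replaces the .ix indexer used in this post. The variable names rows_kept_frame and rows_kept_col2 are just for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, not the hey/hoo values above).
df = pd.DataFrame(np.arange(18, dtype=float).reshape(6, 3))
df.loc[1:2, 0] = np.nan  # label-based and inclusive: rows 1 and 2 of column 0
df.loc[3:4, 1] = np.nan  # rows 3 and 4 of column 1

# DataFrame.dropna() defaults to how="any": a row is dropped if ANY column is NaN.
rows_kept_frame = df.dropna().index.tolist()

# Series.dropna() only looks at the one column it is called on.
rows_kept_col2 = df[2].dropna().index.tolist()

print(rows_kept_frame)  # only rows 0 and 5 survive
print(rows_kept_col2)   # all six rows survive, column 2 has no NaN
```

This is why column 2 loses rows in the frame-wide version even though column 2 itself has no missing values.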

I can work around the problem with an ugly iteration fudge, as follows...

hey_corr_sparse = DataFrame(np.nan, index=hey.index,columns=hey.columns)
for i in hey_corr_sparse.columns:
    hey_corr_sparse.ix[:,i] = rolling_corr(hey.ix[:,i].dropna(),hoo.ix[:,i].dropna(), 4)
hey_corr_sparse

Out[406]: 

   0        1        2
0  NaN      NaN      NaN
1  NaN      NaN      NaN
2  NaN      NaN      NaN
3  0.991273 0.992557 0.985773
4  0.953041 0.999411 0.958595
5  NaN      0.944246 0.961917
6  NaN      NaN      0.941467
7  NaN      NaN      0.963183
8  NaN      NaN      0.980530
9  0.993865 NaN      0.984484
10 0.997691 NaN      0.998441
11 0.978982 0.991095 0.997462
12 0.914663 0.990844 0.998134
13 0.933355 0.995848 0.976262
14 0.971828 0.937512 0.994037

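Does anyone in the community know whether it is possible to get this result with an array function instead? I tried to use .apply but drew a blank. Is it even possible to .apply a function that works on two data structures (hey and hoo in this example)?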

Many thanks, LW

1 Answer:

Answer 0: (score: 2)

You could try this:

>>> def sparse_rolling_corr(ts, other, window):
...     return rolling_corr(ts.dropna(), other[ts.name].dropna(), window).reindex_like(ts)
... 
>>> hey.apply(sparse_rolling_corr, args=(hoo, 4))
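For readers on newer pandas, where the top-level rolling_corr() function and the .ix indexer are no longer available, the same idea can be sketched with the .rolling(window).corr(other) method and .loc. This is an adaptation under that assumption, not the answerer's original code; the seeded random data is also my choice for illustration. apply() passes each column of hey as a Series whose .name is the column label, which is how the function picks the matching column out of hoo, and args=(hoo, 4) supplies the remaining positional arguments.

```python
import numpy as np
import pandas as pd

def sparse_rolling_corr(ts, other, window):
    """Rolling correlation of one column against the same-named column of
    `other`, computed on the non-NaN rows only, then aligned back to the
    full index of `ts` (rows that were dropped come back as NaN)."""
    a = ts.dropna()
    b = other[ts.name].dropna()
    return a.rolling(window).corr(b).reindex_like(ts)

# Illustrative data built the same way as in the question, but seeded.
rng = np.random.default_rng(0)
hey = (pd.DataFrame(rng.random((15, 3))) + 0.2).cumsum()
hoo = (pd.DataFrame(rng.random((15, 3))) + 0.2).cumsum()

# Blank out the same label ranges as the question (.loc slices are inclusive).
hey.loc[5:8, 0] = np.nan
hey.loc[6:10, 1] = np.nan
hoo.loc[5:8, 0] = np.nan
hoo.loc[6:10, 1] = np.nan

# Each column of hey is passed as `ts`; hoo and 4 are passed through args.
result = hey.apply(sparse_rolling_corr, args=(hoo, 4))
print(result)
```

Columns 0 and 1 should come back with NaN in the rows that were blanked out, plus the first three rows where the window is not yet full, which matches the NaN pattern of the loop-based result in the question.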