I have a DataFrame in the format below, and I am trying to create df['New'], a rotating value as shown below, which I will use to calculate the correlation between Alpha and New.
Date Alpha Bravo Charlie New Correlation
2018-01-03 1 3 2 3 (from bravo column) NaN
2018-01-04 2 6 4 6 (from bravo column) NaN
2018-01-05 3 9 6 9 (from bravo column) NaN
2018-01-06 4 12 8 12 (from bravo column) NaN
2018-01-07 5 15 10 10 (from Charlie column) X
For the next date:
Date Alpha Bravo Charlie New Correlation
2018-01-03 1 3 2 3 (from bravo column) NaN
2018-01-04 2 6 4 6 (from bravo column) NaN
2018-01-05 3 9 6 9 (from bravo column) NaN
2018-01-06 4 12 8 12 (from bravo column) NaN
2018-01-07 5 15 10 15 (from bravo column) X
2018-01-08 6 18 12 12 (from Charlie column) Y
df['Correlation'] = df['Alpha'].rolling(window=5).corr(other=df['New'])
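Any suggestions on how to create this new column with rotating values? (This way my earlier correlation stays unchanged at X. My end goal is the Correlation column; the New column is only used to calculate the correlation.)
In other words, each time the Correlation column is calculated, it uses the Charlie value for the latest date and the Bravo values for all earlier dates.
Another way to put it: the correlation with Alpha is always computed from the Charlie value on the last date and the Bravo values of the previous 4 days, as in the tables above.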
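The calculation described above can be sketched as a plain loop. This is only a reference illustration, assuming the six sample rows from the tables; the function name `rotating_corr` is hypothetical and the loop is not optimized.

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables in the question.
df = pd.DataFrame({
    "Date": pd.date_range("2018-01-03", periods=6),
    "Alpha": [1, 2, 3, 4, 5, 6],
    "Bravo": [3, 6, 9, 12, 15, 18],
    "Charlie": [2, 4, 6, 8, 10, 12],
})

def rotating_corr(df, window=5):
    """Correlate Alpha with (window - 1) Bravo values plus the latest Charlie value."""
    alpha = df["Alpha"].to_numpy(dtype=float)
    bravo = df["Bravo"].to_numpy(dtype=float)
    charlie = df["Charlie"].to_numpy(dtype=float)
    out = np.full(len(df), np.nan)  # rows without a full window stay NaN
    for i in range(window - 1, len(df)):
        new = bravo[i - window + 1 : i + 1].copy()
        new[-1] = charlie[i]  # only the newest value comes from Charlie
        out[i] = np.corrcoef(alpha[i - window + 1 : i + 1], new)[0, 1]
    return pd.Series(out, index=df.index)

df["Correlation"] = rotating_corr(df)
print(df)
```

With this data, the values labelled X and Y in the tables come out at roughly 0.894 and 0.832.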
Answer 0 (score: 1)
I think you first need to prepend NaNs, then use this strides-based rolling-window solution, and then compute the correlation:
import numpy as np
import pandas as pd

def rolling_window(a, window):
    # Build a 2-D view in which row i holds a[i:i + window]
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

N = 5
# Prepend N-1 NaNs so that every original row gets a full-length window
a = np.concatenate([[np.nan] * (N-1), df['Bravo'].values])
b = np.concatenate([[np.nan] * (N-1), df['Alpha'].values])
a1 = rolling_window(a, N)
a2 = rolling_window(b, N)
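As an aside, NumPy 1.20 and later provide `numpy.lib.stride_tricks.sliding_window_view`, which builds the same kind of windows without manual stride arithmetic. A minimal sketch, assuming that NumPy version is available and using the sample Bravo values (the names `bravo`, `padded` and `windows` are illustrative only):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

N = 5
# Sample Bravo values, including the 2018-01-09 row from the answer's output.
bravo = np.array([3, 6, 9, 12, 15, 18, 21], dtype=float)

# Pad with N-1 NaNs so that there is one window per original row.
padded = np.concatenate([np.full(N - 1, np.nan), bravo])

# Read-only view; row i holds padded[i:i + N].
windows = sliding_window_view(padded, N)

print(windows.shape)
print(windows[-1])
```

The result is a read-only view, so a step that modifies the windows has to build a new array, which `np.c_` does anyway.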
Remove the last column of a1 and append the values of the Charlie column:
c = np.c_[a1[:, :-1], df['Charlie'].values[:, None]]
print (c)
[[nan nan nan nan 2.]
[nan nan nan 3. 4.]
[nan nan 3. 6. 6.]
[nan 3. 6. 9. 8.]
[ 3. 6. 9. 12. 10.]
[ 6. 9. 12. 15. 12.]
[ 9. 12. 15. 18. 15.]]
Create DataFrames and drop the first N-1 rows, which contain NaN, with iloc:
a = pd.DataFrame(a2, index=df.index).iloc[N-1:]
b = pd.DataFrame(c, index=df.index).iloc[N-1:]
df['Correlation1'] = a.corrwith(b, axis=1)
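# For better performance, use corr2_coeff_rowwise from the linked answer
# (it is not defined here and must be copied from there):
# https://stackoverflow.com/a/41703623/2901002
df['Correlation2'] = corr2_coeff_rowwise(a2, c)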
print (df)
Date Alpha Bravo Charlie Correlation1 Correlation2
0 2018-01-03 1 3 2 NaN NaN
1 2018-01-04 2 6 4 NaN NaN
2 2018-01-05 3 9 6 NaN NaN
3 2018-01-06 4 12 8 NaN NaN
4 2018-01-07 5 15 10 0.894427 0.894427
5 2018-01-08 6 18 12 0.832050 0.832050
6 2018-01-09 7 21 15 0.832050 0.832050
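The helper `corr2_coeff_rowwise` is not defined in this answer; it comes from the linked Stack Overflow answer. The following is a minimal sketch of what a row-wise Pearson correlation of that kind could look like, not necessarily the linked implementation. The name `rowwise_corr` and the small input arrays are hypothetical, with the arrays taken from the last two full windows of the sample data.

```python
import numpy as np

def rowwise_corr(A, B):
    """Pearson correlation between matching rows of two 2-D arrays."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Center every row on its own mean.
    A_c = A - A.mean(axis=1, keepdims=True)
    B_c = B - B.mean(axis=1, keepdims=True)
    # Row-wise sum of products, divided by the row-wise norms.
    num = (A_c * B_c).sum(axis=1)
    den = np.sqrt((A_c ** 2).sum(axis=1) * (B_c ** 2).sum(axis=1))
    return num / den  # rows that contain NaN come out as NaN

# Alpha windows and the matching "Bravo plus latest Charlie" windows.
alpha_windows = np.array([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])
new_windows = np.array([[3, 6, 9, 12, 10], [6, 9, 12, 15, 12]])
print(rowwise_corr(alpha_windows, new_windows))
```

Because it works on whole arrays at once, a function like this avoids the per-row overhead of `corrwith`, which is the likely reason the answer suggests it for larger data.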