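I want to add a mean column and a std column for each column of my DataFrame. Unfortunately, my code replaces the original columns with mean and std.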
import numpy as np
import pandas as pd
np.random.seed(50)
df = pd.DataFrame(np.random.randint(0, 9, size=(30, 3)), columns=list('ABC'))
print(df)
df
A B C
0 0 0 1
1 4 6 5
2 6 6 5
3 2 7 4
4 3 6 4
5 1 5 0
6 6 3 2
7 3 3 3
8 2 0 3
9 2 0 3
10 0 0 7
11 3 8 7
12 4 4 0
13 0 3 3
14 1 4 5
15 7 0 3
16 5 6 1
17 4 4 4
18 5 4 6
19 3 0 5
20 8 3 6
21 2 8 8
22 5 4 7
23 8 4 4
24 2 1 8
25 7 1 5
26 8 3 3
27 5 3 6
28 8 6 0
29 8 2 1
Here is my code:
https://pandas.pydata.org/pandas-docs/stable/computation.html
r = df.rolling(window=5)
print('Agg mean and sdt df')
# Select several columns with a list; 'mean' and 'std' name pandas' built-in aggregations
print(r[['A', 'B', 'C']].agg(['mean', 'std']))
print()
Output:
Agg mean and sdt df
A B C
mean std mean std mean std
0 NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN
4 3.0 2.236068 5.0 2.828427 3.8 1.643168
5 3.2 1.923538 6.0 0.707107 3.6 2.073644
6 3.6 2.302173 5.4 1.516575 3.0 2.000000
7 3.0 1.870829 4.8 1.788854 2.6 1.673320
8 3.0 1.870829 3.4 2.302173 2.4 1.516575
9 2.8 1.923538 2.2 2.167948 2.2 1.303840
10 2.6 2.190890 1.2 1.643168 3.6 1.949359
11 2.0 1.224745 2.2 3.492850 4.6 2.190890
12 2.2 1.483240 2.4 3.577709 4.0 3.000000
13 1.8 1.788854 3.0 3.316625 4.0 3.000000
14 1.6 1.816590 3.8 2.863564 4.4 2.966479
15 3.0 2.738613 3.8 2.863564 3.6 2.607681
16 3.4 2.880972 3.4 2.190890 2.4 1.949359
17 3.4 2.880972 3.4 2.190890 3.2 1.483240
18 4.4 2.190890 3.6 2.190890 3.8 1.923538
19 4.8 1.483240 2.8 2.683282 3.8 1.923538
20 5.0 1.870829 3.4 2.190890 4.4 2.073644
21 4.4 2.302173 3.8 2.863564 5.8 1.483240
22 4.6 2.302173 3.8 2.863564 6.4 1.140175
23 5.2 2.774887 3.8 2.863564 6.0 1.581139
24 5.0 3.000000 4.0 2.549510 6.6 1.673320
25 4.8 2.774887 3.6 2.880972 6.4 1.816590
26 6.0 2.549510 2.6 1.516575 5.4 2.073644
27 6.0 2.549510 2.4 1.341641 5.2 1.923538
28 6.0 2.549510 2.8 2.049390 4.4 3.049590
29 7.2 1.303840 3.0 1.870829 3.0 2.549510
What I am looking for is these columns (with their data):
A A_mean A_std B B_mean B_std C C_mean C_std
I could not find a way to add these columns.
Thank you for your suggestions.
Answer 0 (score: 1)
In [18]: res = df.rolling(5).agg(['mean','std'])
In [19]: res.columns = res.columns.map('_'.join)
In [54]: cols = np.concatenate(list(zip(df.columns, res.columns[0::2], res.columns[1::2])))
In [55]: cols
Out[55]:
array(['A', 'A_mean', 'A_std', 'B', 'B_mean', 'B_std', 'C', 'C_mean', 'C_std'],
dtype='<U6')
In [56]: res.join(df).loc[:, cols]
Out[56]:
A A_mean A_std B B_mean B_std C C_mean C_std
0 0 NaN NaN 0 NaN NaN 1 NaN NaN
1 4 NaN NaN 6 NaN NaN 5 NaN NaN
2 6 NaN NaN 6 NaN NaN 5 NaN NaN
3 2 NaN NaN 7 NaN NaN 4 NaN NaN
4 3 3.0 2.236068 6 5.0 2.828427 4 3.8 1.643168
5 1 3.2 1.923538 5 6.0 0.707107 0 3.6 2.073644
6 6 3.6 2.302173 3 5.4 1.516575 2 3.0 2.000000
7 3 3.0 1.870829 3 4.8 1.788854 3 2.6 1.673320
8 2 3.0 1.870829 0 3.4 2.302173 3 2.4 1.516575
9 2 2.8 1.923538 0 2.2 2.167948 3 2.2 1.303840
10 0 2.6 2.190890 0 1.2 1.643168 7 3.6 1.949359
11 3 2.0 1.224745 8 2.2 3.492850 7 4.6 2.190890
12 4 2.2 1.483240 4 2.4 3.577709 0 4.0 3.000000
13 0 1.8 1.788854 3 3.0 3.316625 3 4.0 3.000000
14 1 1.6 1.816590 4 3.8 2.863564 5 4.4 2.966479
15 7 3.0 2.738613 0 3.8 2.863564 3 3.6 2.607681
16 5 3.4 2.880972 6 3.4 2.190890 1 2.4 1.949359
17 4 3.4 2.880972 4 3.4 2.190890 4 3.2 1.483240
18 5 4.4 2.190890 4 3.6 2.190890 6 3.8 1.923538
19 3 4.8 1.483240 0 2.8 2.683282 5 3.8 1.923538
20 8 5.0 1.870829 3 3.4 2.190890 6 4.4 2.073644
21 2 4.4 2.302173 8 3.8 2.863564 8 5.8 1.483240
22 5 4.6 2.302173 4 3.8 2.863564 7 6.4 1.140175
23 8 5.2 2.774887 4 3.8 2.863564 4 6.0 1.581139
24 2 5.0 3.000000 1 4.0 2.549510 8 6.6 1.673320
25 7 4.8 2.774887 1 3.6 2.880972 5 6.4 1.816590
26 8 6.0 2.549510 3 2.6 1.516575 3 5.4 2.073644
27 5 6.0 2.549510 3 2.4 1.341641 6 5.2 1.923538
28 8 6.0 2.549510 6 2.8 2.049390 0 4.4 3.049590
29 8 7.2 1.303840 2 3.0 1.870829 1 3.0 2.549510
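
The same steps can be wrapped in a small function so that the column order does not depend on slicing `res.columns` by position. This is only a sketch: `add_rolling_stats` and its parameters `window` and `stats` are hypothetical names introduced here for illustration, and it assumes the column labels are strings so that `'_'.join` can flatten the two-level column index.

```python
import numpy as np
import pandas as pd

np.random.seed(50)
df = pd.DataFrame(np.random.randint(0, 9, size=(30, 3)), columns=list('ABC'))


def add_rolling_stats(frame, window=5, stats=('mean', 'std')):
    """Hypothetical helper: put <col>_<stat> columns right after each original column."""
    # Rolling aggregation yields two-level columns such as ('A', 'mean'), ('A', 'std')
    res = frame.rolling(window).agg(list(stats))
    # Flatten the two levels into 'A_mean', 'A_std', ... (assumes string labels)
    res.columns = res.columns.map('_'.join)
    # Build the interleaved order explicitly: A, A_mean, A_std, B, ...
    order = []
    for col in frame.columns:
        order.append(col)
        order.extend('{}_{}'.format(col, s) for s in stats)
    # Join on the index, then reorder the columns
    return frame.join(res)[order]


out = add_rolling_stats(df)
print(out.columns.tolist())
# ['A', 'A_mean', 'A_std', 'B', 'B_mean', 'B_std', 'C', 'C_mean', 'C_std']
```

Building `order` by name rather than with `res.columns[0::2]` and `res.columns[1::2]` keeps the layout correct if a third statistic (for example `'min'`) is added to `stats`.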