Here is the DataFrame:
                                        load        lmp
read_year read_month trading_block
2017      3           0             0.018033  27.832902
                      1             0.023771  34.044462
          4           0             0.017487  25.136200
                      1             0.023570  33.487529
          5           0             0.018008  24.085170
                      1             0.024557  36.357774
          6           0             0.021342  22.528570
                      1             0.028840  31.127481
          7           0             0.022381  24.076738
                      1             0.031395  37.653610
          8           0             0.021408  22.171804
                      1             0.030574  32.599279
          9           0             0.019850  24.391908
                      1             0.027178  39.192316
          10          0             0.017754  25.593717
                      1             0.023717  34.941795
          11         -1             0.014916  18.443703
                      0             0.015961  25.708624
                      1             0.020092  33.650612
          12          0             0.016170  28.675776
                      1             0.020008  36.851096
2018      1           0             0.015894  49.115699
                      1             0.019224  59.492227
          2           0             0.015765  23.719127
                      1             0.019607  29.572859
          3           0             0.016970  29.240378
                      1             0.021500  36.516138
          4           0             0.016267  31.317317
                      1             0.022204  39.404220
          5           0             0.017652  27.454792
                      1             0.024314  41.900247
The indexing is the part that trips me up. What I ultimately need is something like this:
trading_block  read_month  Correlation Coefficient
            0           1                 0.740597
            0           2                 0.744560
            0           3                 0.300000
            0           4                 0.325736
            0           5                 0.300000
            0           6                 0.846745
            0           7                 0.784101
            0           8                 0.684961
            0           9                 0.796357
            0          10                 0.758172
            0          11                 0.577991
            0          12                 0.684050
            1           1                 0.556274
            1           2                 0.328713
            1           3                 0.300000
            1           4                 0.300000
            1           5                 0.300000
            1           6                 0.639870
            1           7                 0.591472
            1           8                 0.658894
            1           9                 0.615737
            1          10                 0.500315
            1          11                 0.300000
            1          12                 0.346552
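I have already done the math once, but in an overly complicated way. There must be a simpler way to do it; I just don't know what it is. I assume I need a `groupby` call or something similar.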
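To make the `groupby` idea concrete, here is a minimal sketch of the shape I have in mind. It rests on several assumptions: that the coefficient wanted is the ordinary Pearson correlation between `load` and `lmp`, that it is computed from row-level data rather than from the monthly averages, and that a frame like the hypothetical `hourly` below exists. The values in `hourly` are synthetic and only there so the snippet runs.

```python
import numpy as np
import pandas as pd

# Hypothetical row-level data: column names follow the question, values are synthetic.
rng = np.random.default_rng(0)
months = np.tile(np.repeat(np.arange(1, 13), 10), 2)  # 10 rows per month, per block
blocks = np.repeat([0, 1], 120)
hourly = pd.DataFrame({
    "read_month": months,
    "trading_block": blocks,
    "load": rng.normal(0.02, 0.004, size=months.size),
})
# Synthetic lmp, loosely tied to load so the correlations are not trivial.
hourly["lmp"] = 1000 * hourly["load"] + rng.normal(0, 3, size=months.size)

# One Pearson correlation per (trading_block, read_month) group,
# flattened back into ordinary columns with reset_index().
corr = (
    hourly.groupby(["trading_block", "read_month"])[["load", "lmp"]]
    .apply(lambda g: g["load"].corr(g["lmp"]))
    .rename("Correlation Coefficient")
    .reset_index()
)
print(corr)
```

Grouping on `trading_block` first and `read_month` second gives the row order of the table I am after, and `reset_index()` turns the two group keys back into plain columns, which is the indexing step I keep getting wrong.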
Here is the equation:
X-bar (x̄) is the mean of `reading` for each month where `trading_block` is 0 or 1, as shown below:
    hour_ending   read_date  read_month  read_year   reading  trading_block
 0            1  2017-03-23           3       2017  0.019582              0
 1            2  2017-03-23           3       2017  0.019460              0
 2            3  2017-03-23           3       2017  0.018888              0
 3            4  2017-03-23           3       2017  0.018940              0
 4            5  2017-03-23           3       2017  0.019114              0
 5            6  2017-03-23           3       2017  0.020050              0
 6            7  2017-03-23           3       2017  0.022545              0
 7            8  2017-03-23           3       2017  0.024053              1
 8            9  2017-03-23           3       2017  0.026028              1
 9           10  2017-03-23           3       2017  0.027726              1
10           11  2017-03-23           3       2017  0.029251              1
11           12  2017-03-23           3       2017  0.028887              1
12           13  2017-03-23           3       2017  0.027397              1
13           14  2017-03-23           3       2017  0.027536              1
14           15  2017-03-23           3       2017  0.026253              1
15           16  2017-03-23           3       2017  0.025872              1
16           17  2017-03-23           3       2017  0.024746              1
17           18  2017-03-23           3       2017  0.023481              1
18           19  2017-03-23           3       2017  0.022701              1
19           20  2017-03-23           3       2017  0.023377              1
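For the X-bar part, a sketch of how the per-month, per-block mean might be computed with `groupby`, using only the 20 sample rows shown above. The names `df`, `x_bar` and `deviation` are my own placeholders, and the full data set would of course cover many more days and months.

```python
import pandas as pd

# The 20 sample rows from the table above (2017-03-23, hours ending 1 to 20).
readings = [0.019582, 0.019460, 0.018888, 0.018940, 0.019114, 0.020050, 0.022545,
            0.024053, 0.026028, 0.027726, 0.029251, 0.028887, 0.027397, 0.027536,
            0.026253, 0.025872, 0.024746, 0.023481, 0.022701, 0.023377]
df = pd.DataFrame({
    "hour_ending": range(1, 21),
    "read_date": pd.to_datetime("2017-03-23"),
    "read_month": 3,
    "read_year": 2017,
    "reading": readings,
    "trading_block": [0] * 7 + [1] * 13,
})

keys = ["read_year", "read_month", "trading_block"]

# X-bar: one mean of `reading` per (year, month, trading_block) group.
x_bar = df.groupby(keys)["reading"].mean().rename("x_bar")
print(x_bar)

# transform("mean") repeats each group's mean on every row of that group,
# so the deviation from X-bar lines up with the original hourly rows.
df["deviation"] = df["reading"] - df.groupby(keys)["reading"].transform("mean")
```

The result of `mean()` carries the same three-level index as the first DataFrame in this question; `transform` is the variant to reach for when the group mean has to sit next to each row, as it does in a correlation formula.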