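I have two Series of the same length and dtype; both are float64. The only difference is the index: both are dates, but one falls at the start of each month and the other at the end. How can I run calculations such as correlation or covariance on Series or DataFrames whose indexes differ?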
import numpy as np
from pandas import Series, DataFrame
import pandas as pd
import Quandl
IPO=Quandl.get("RITTER/US_IPO_STATS", authtoken="api key")
ir=Quandl.get("FRBC/REALRT", authtoken="api key")
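# Take 398 rows from each dataset so the lengths match
ipo_splice = IPO[264:662]
new_ipo = ipo_splice['Gross Number of IPOs']
ir_splice = ir[0:398]
new_ir = ir_splice['RR 1 Month']
# corr() aligns on index labels first; month-start and month-end dates never match
new_ipo.corr(new_ir)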
Answer 0 (score: 0)
Call reset_index(drop=True) on the things you want to correlate, then concat them.
s1 = pd.DataFrame(np.random.rand(10), list('abcdefghij'), columns=['s1'])
s2 = pd.DataFrame(np.random.rand(10), list('ABCDEFGHIJ'), columns=['s2'])
print(pd.concat([s.reset_index(drop=True) for s in [s1, s2]], axis=1).corr())
          s1        s2
s1  1.000000 -0.437945
s2 -0.437945  1.000000
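The same positional approach also works directly on two date-indexed Series. The sketch below is only illustrative: the data is randomly generated and the names `a`, `b`, `bom_index`, and `eom_index` are hypothetical, not from the question. It shows that `corr()` returns NaN when no index labels match, and that dropping the indexes pairs the values by position instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: same length, one index at month start, one at month end
rng = np.random.default_rng(0)
bom_index = pd.date_range('2015-01-01', periods=10, freq='MS')
eom_index = bom_index + pd.offsets.MonthEnd(0)  # roll each date to its month end
a = pd.Series(rng.random(10), index=bom_index, name='a')
b = pd.Series(rng.random(10), index=eom_index, name='b')

# corr() aligns on index labels first; no labels match, so the result is NaN
direct = a.corr(b)

# Drop both indexes so the values pair up by position instead
a_pos = a.reset_index(drop=True)
b_pos = b.reset_index(drop=True)
corr_by_position = a_pos.corr(b_pos)
cov_by_position = a_pos.cov(b_pos)

print(direct, corr_by_position, cov_by_position)
```

Positional pairing assumes both Series are sorted the same way and cover the same months, so it is worth checking that before trusting the result.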
Answer 1 (score: 0)
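You can resample one of your indexes with the resample() function (the goal is for both to share either a beginning-of-month (BoM) or an end-of-month (EoM) index):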
Data:
In [63]: df_bom
Out[63]:
            val
2015-01-01   76
2015-02-01   27
2015-03-01   65
2015-04-01   71
2015-05-01    9
2015-06-01   23
2015-07-01   52
2015-08-01   10
2015-09-01   62
2015-10-01   25
In [64]: df_eom
Out[64]:
            val
2015-01-31   87
2015-02-28   16
2015-03-31   85
2015-04-30    4
2015-05-31   37
2015-06-30   63
2015-07-31    3
2015-08-31   73
2015-09-30   81
2015-10-31   69
Solution:
In [61]: df_eom.resample('MS').mean() + df_bom
Out[61]:
            val
2015-01-01  163
2015-02-01   43
2015-03-01  150
2015-04-01   75
2015-05-01   46
2015-06-01   86
2015-07-01   55
2015-08-01   83
2015-09-01  143
2015-10-01   94
In [62]: df_eom.resample('MS').mean().join(df_bom, lsuffix='_lft')
Out[62]:
            val_lft  val
2015-01-01       87   76
2015-02-01       16   27
2015-03-01       85   65
2015-04-01        4   71
2015-05-01       37    9
2015-06-01       63   23
2015-07-01        3   52
2015-08-01       73   10
2015-09-01       81   62
2015-10-01       69   25
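Once both frames share a month-start index, the correlation and covariance asked about in the question follow from the joined frame. This is a minimal sketch under stated assumptions: the values are made up, the month-end index is built with `pd.offsets.MonthEnd(0)` rather than a frequency alias, and the suffixes `_eom` and `_bom` are chosen here for readability.

```python
import numpy as np
import pandas as pd

# Hypothetical frames shaped like df_bom / df_eom (values are made up)
bom_index = pd.date_range('2015-01-01', periods=10, freq='MS')
eom_index = bom_index + pd.offsets.MonthEnd(0)
rng = np.random.default_rng(1)
df_bom = pd.DataFrame({'val': rng.integers(0, 100, 10)}, index=bom_index)
df_eom = pd.DataFrame({'val': rng.integers(0, 100, 10)}, index=eom_index)

# Move the month-end rows onto month-start labels; each bin holds one row,
# so mean() simply returns that row's value
eom_on_bom = df_eom.resample('MS').mean()

# Both frames now share an index, so join lines the months up side by side
joined = eom_on_bom.join(df_bom, lsuffix='_eom', rsuffix='_bom')

corr_matrix = joined.corr()
cov_matrix = joined.cov()
print(corr_matrix)
print(cov_matrix)
```

The off-diagonal entries of `corr_matrix` and `cov_matrix` are the pairwise correlation and covariance of the two series.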
Alternative approach: merge the DataFrames on the year and month parts of their indexes:
In [69]: %paste
(pd.merge(df_bom, df_eom,
left_on=[df_bom.index.year, df_bom.index.month],
right_on=[df_eom.index.year, df_eom.index.month],
suffixes=('_bom','_eom')))
## -- End pasted text --
Out[69]:
   key_0  key_1  val_bom  val_eom
0   2015      1       76       87
1   2015      2       27       16
2   2015      3       65       85
3   2015      4       71        4
4   2015      5        9       37
5   2015      6       23       63
6   2015      7       52        3
7   2015      8       10       73
8   2015      9       62       81
9   2015     10       25       69
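After a year/month merge like the one above, the two value columns sit in one frame, so the statistics reduce to column-to-column calls. The following is a rough sketch with made-up values; the variable names `merged`, `corr`, and `cov` are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical frames shaped like df_bom / df_eom (values are made up)
bom_index = pd.date_range('2015-01-01', periods=10, freq='MS')
eom_index = bom_index + pd.offsets.MonthEnd(0)
rng = np.random.default_rng(2)
df_bom = pd.DataFrame({'val': rng.integers(0, 100, 10)}, index=bom_index)
df_eom = pd.DataFrame({'val': rng.integers(0, 100, 10)}, index=eom_index)

# Pair rows that fall in the same calendar year and month
merged = pd.merge(
    df_bom, df_eom,
    left_on=[df_bom.index.year, df_bom.index.month],
    right_on=[df_eom.index.year, df_eom.index.month],
    suffixes=('_bom', '_eom'),
)

# The two value columns are now row-aligned by month
corr = merged['val_bom'].corr(merged['val_eom'])
cov = merged['val_bom'].cov(merged['val_eom'])
print(corr, cov)
```

Unlike positional alignment, this pairing still works if one frame is missing a month, since unmatched months simply drop out of the default inner merge.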
Setup:
In [59]: df_bom = pd.DataFrame({'val':np.random.randint(0,100, 10)}, index=pd.date_range('2015-01-01', periods=10, freq='MS'))
In [60]: df_eom = pd.DataFrame({'val':np.random.randint(0,100, 10)}, index=pd.date_range('2015-01-01', periods=10, freq='M'))