I have monthly and biennial data that I want to join.
Monthly:
                           foo        foobar
date       INCAGG
2003-01-01 1          7.892858  7.623477e+07
           3        123.995220  2.120104e+08
           5        133.645028  3.124879e+08
           7        792.390234  5.401223e+08
2003-02-01 1        175.326590  7.037367e+07
           3        295.189979  3.515387e+08
           5        704.893690  3.301345e+08
           7        174.118220  6.025263e+08
2003-03-01 1       2068.875565  6.646029e+07
           3        213.663057  1.821990e+08
           5       2507.293175  2.017673e+08
           7        433.253711  4.542890e+08
2003-04-01 1         79.069296  3.253000e+07
           3         38.485372  5.502446e+07
           5        170.548422  6.304233e+08
           7       1363.115717  4.413133e+08
Biennial:
                     foobar
date       INCAGG
2003-01-01 1       0.113312
           3       0.167293
           5       0.283961
           7       0.346094
           9       0.089340
2005-01-01 1       0.119631
           3       0.155010
           5       0.301366
           7       0.332117
           9       0.091877
A direct join only matches the rows dated the first of January. What is the correct way to join these?
Answer 0 (score: 1)
What do you want to happen for years in which no biennial measurement was taken? Use the last available value, or nothing?
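import pandas as pd
# Turn the (date, INCAGG) index levels into ordinary columns so they can serve as merge keys.
df1.reset_index(inplace=True)
df2.reset_index(inplace=True)
# Match on calendar year plus INCAGG, so every month of a year picks up that year's biennial value.
merged = pd.merge(df1, df2, left_on=[df1.date.dt.year, 'INCAGG'],
                  right_on=[df2.date.dt.year, 'INCAGG'], suffixes=('', 'R'),
                  how='left').set_index(['date', 'INCAGG'])
# Drop the duplicated right-hand date; some pandas versions also add a helper 'key_0' column for the year key.
merged = merged.drop(columns=['dateR', 'key_0'], errors='ignore')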
Output:
                           foo        foobar   foobarR
date       INCAGG
2003-01-01 1          7.892858  7.623477e+07  0.113312
           3        123.995220  2.120104e+08  0.167293
           5        133.645028  3.124879e+08  0.283961
           7        792.390234  5.401223e+08  0.346094
2003-02-01 1        175.326590  7.037367e+07  0.113312
           3        295.189979  3.515387e+08  0.167293
           5        704.893690  3.301345e+08  0.283961
           7        174.118220  6.025263e+08  0.346094
2003-03-01 1       2068.875565  6.646029e+07  0.113312
           3        213.663057  1.821990e+08  0.167293
           5       2507.293175  2.017673e+08  0.283961
           7        433.253711  4.542890e+08  0.346094
2003-04-01 1         79.069296  3.253000e+07  0.113312
           3         38.485372  5.502446e+07  0.167293
           5        170.548422  6.304233e+08  0.283961
           7       1363.115717  4.413133e+08  0.346094
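The merge above matches on calendar year, so monthly rows in a year with no biennial measurement (2004, for example) would get NaN in foobarR. If the "last available" behaviour is what you want instead, pd.merge_asof is one option. The following is only a minimal sketch: the small monthly and biennial frames are made-up stand-ins for the flattened df1 and df2, and the column is named foobarR directly for brevity.

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2 after reset_index().
monthly = pd.DataFrame({
    "date": pd.to_datetime(["2003-01-01", "2003-02-01", "2004-03-01", "2005-02-01"] * 2),
    "INCAGG": [1, 1, 1, 1, 3, 3, 3, 3],
    "foo": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})
biennial = pd.DataFrame({
    "date": pd.to_datetime(["2003-01-01", "2003-01-01", "2005-01-01", "2005-01-01"]),
    "INCAGG": [1, 3, 1, 3],
    "foobarR": [0.113312, 0.167293, 0.119631, 0.155010],
})

# merge_asof needs both frames sorted by the "on" key.
# direction="backward" takes, within each INCAGG group, the most recent
# biennial row whose date is on or before the monthly date.
merged = pd.merge_asof(
    monthly.sort_values("date"),
    biennial.sort_values("date"),
    on="date",
    by="INCAGG",
    direction="backward",
).set_index(["date", "INCAGG"]).sort_index()

print(merged)
```

Here the 2004 rows carry forward the 2003 measurement and the 2005 rows pick up the 2005 one. To cap how far a value may be carried forward, merge_asof also accepts a tolerance argument (a pd.Timedelta for datetime keys).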