基于索引和列连接两个Dataframe

时间:2017-05-17 13:47:40

标签: python join dataframe merge

我有以下两个数据帧:

DF1:

            Id
date          
2014-03-13   1
2014-03-14   2
2014-03-15   1

DF2:

            Id people  value
date                        
2014-03-13   1      A   -3.0
2014-03-13   1      B   -6.0
2014-03-13   4      C   -3.2
2014-03-14   1      A   -3.1
2014-03-14   2      B   -5.0
2014-03-14   2      C   -3.4
2014-03-14   7      D   -6.2
2014-03-14   8      E   -3.2
2014-03-15   1      A   -3.2
2014-03-15   3      B   -5.9

我想要做的是根据Id合并这两个数据框,与索引(date)一致。

预期结果如下:

            Id people  value
date                        
2014-03-13   1      A   -3.0
2014-03-13   1      B   -6.0
2014-03-14   2      B   -5.0
2014-03-14   2      C   -3.4
2014-03-15   1      A   -3.2

我努力使用mergejoin,但没有成功。

生成输入的代码如下:

import pandas as pd

dates = [pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'), pd.to_datetime('2014-03-15', format='%Y-%m-%d')]
Ids = [1,2,1]
df1 = pd.DataFrame({'Id': pd.Series(Ids, index=dates)})
df1.index.name = 'date'

dates = [pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-13', format='%Y-%m-%d'),
         pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'),
         pd.to_datetime('2014-03-14', format='%Y-%m-%d'),pd.to_datetime('2014-03-14', format='%Y-%m-%d'),
         pd.to_datetime('2014-03-14', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'), 
         pd.to_datetime('2014-03-15', format='%Y-%m-%d'), pd.to_datetime('2014-03-15', format='%Y-%m-%d')]
Ids = [1,1,4,1,2,2,7,8,1,3]
peoples = ['A','B','C','A','B','C','D','E','A','B']
values = [-3,-6,-3.2,-3.1,-5,-3.4,-6.2,-3.2,-3.2,-5.9]
df2 = pd.DataFrame({'Id': pd.Series(Ids, index=dates),
                    'people': pd.Series(peoples, index=dates),
                    'value': pd.Series(values, index=dates)})
df2.index.name = 'date'

1 个答案:

答案 0 :(得分:1)

最简单的是merge + reset_index

df = pd.merge(df1.reset_index(), df2.reset_index(), on=['date','Id']).set_index('date')
print (df)
date        Id people  value
2014-03-13   1      A   -3.0
2014-03-13   1      A   -6.0
2014-03-14   2      B   -5.0
2014-03-14   2      A   -3.4
2014-03-15   1      F   -3.2

还有:

df = pd.merge(df1.set_index('Id', append=True), 
             df2.set_index('Id', append=True), 
             left_index=True, 
             right_index=True)
print (df)
              people  value
date       Id              
2014-03-13 1       A   -3.0
           1       A   -6.0
2014-03-14 2       B   -5.0
           2       A   -3.4
2014-03-15 1       F   -3.2