Pandas calculation on two DataFrames with different indexes

Time: 2017-06-12 14:31:44

Tags: python pandas dataframe

I have two DataFrames with different indexes but a matching column. How can I calculate the difference between them?

For example, using

import pandas as pd

df1 = pd.DataFrame({ 'a': (188, 750, 1330, 1385, 188, 750, 1330, 1385),
                    'b': (51.12, 51.45, 74.49, 29.21, 39.98, 3.98, 14.46, 16.51),
                    'c': pd.Categorical(['R', 'R', 'R', 'R', 'F', 'F', 'F', 'F']) })
df1 = df1.set_index(['a'])

          b  c
a             
188   51.12  R
750   51.45  R
1330  74.49  R
1385  29.21  R
188   39.98  F
750    3.98  F
1330  14.46  F
1385  16.51  F


df2 = pd.DataFrame({ 'x': (20, 50),
                     'c': pd.Categorical(['R', 'F']) })
df2 = df2.set_index(['c'])

    x
c    
R  20
F  50

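I would like to subtract column x of df2 from column b of df1, row by row, based on the value of column c in df1. That value should be matched against the index c of df2.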

The result should be df1 with an additional column holding that difference.

3 Answers:

Answer 0: (score: 3)

You can use join or map:

df1['diff'] = df1['b'] - df1.join(df2, on='c')['x']
print(df1)
          b  c   diff
a                    
188   51.12  R  31.12
750   51.45  R  31.45
1330  74.49  R  54.49
1385  29.21  R   9.21
188   39.98  F -10.02
750    3.98  F -46.02
1330  14.46  F -35.54
1385  16.51  F -33.49

Or:

df1['diff'] = df1['b'] - df1['c'].map(df2['x'])
print(df1)
          b  c   diff
a                    
188   51.12  R  31.12
750   51.45  R  31.45
1330  74.49  R  54.49
1385  29.21  R   9.21
188   39.98  F -10.02
750    3.98  F -46.02
1330  14.46  F -35.54
1385  16.51  F -33.49
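To check that the two approaches above agree, here is a minimal self-contained sketch. It makes one assumption that differs from the question: column c and the index of df2 hold plain strings rather than pd.Categorical, so the sketch does not depend on how a given pandas version handles categorical data in join or map. The names via_join and via_map are illustrative only.

```python
import pandas as pd

# Same values as in the question, but with plain string labels in 'c'
# (an assumption made here to sidestep categorical dtype handling).
df1 = pd.DataFrame({
    'a': (188, 750, 1330, 1385, 188, 750, 1330, 1385),
    'b': (51.12, 51.45, 74.49, 29.21, 39.98, 3.98, 14.46, 16.51),
    'c': ['R', 'R', 'R', 'R', 'F', 'F', 'F', 'F'],
}).set_index('a')

df2 = pd.DataFrame({
    'x': (20, 50),
    'c': ['R', 'F'],
}).set_index('c')

# join: look up each row's 'c' value in df2's index and pull in column 'x'
via_join = df1['b'] - df1.join(df2, on='c')['x']

# map: use df2['x'] (a Series indexed by 'c') as a lookup table
via_map = df1['b'] - df1['c'].map(df2['x'])

# Both Series keep df1's index, including its duplicate labels
print(via_join.equals(via_map))
```

Both variants keep the original index of df1, which matters here because the index a contains duplicate labels. With categorical data, the result of map may itself come back as categorical on some pandas versions, in which case casting it to a numeric dtype before subtracting may be needed.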

Answer 1: (score: 2)

df1.assign(diff = df1['b'] - df1['c'].map(df2.squeeze()))

Output:

          b  c   diff
a                    
188   51.12  R  31.12
750   51.45  R  31.45
1330  74.49  R  54.49
1385  29.21  R   9.21
188   39.98  F -10.02
750    3.98  F -46.02
1330  14.46  F -35.54
1385  16.51  F -33.49
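A short sketch of what squeeze and assign do in this answer, using a reduced two-row version of the data with plain string labels (both are assumptions made for brevity). The names lookup, result, and one_row are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    {'b': [51.12, 39.98], 'c': ['R', 'F']},
    index=pd.Index([188, 188], name='a'),
)
df2 = pd.DataFrame({'x': [20, 50]}, index=pd.Index(['R', 'F'], name='c'))

# A DataFrame with exactly one column squeezes to a Series named after it
lookup = df2.squeeze()

# assign returns a new DataFrame; df1 itself is left unchanged
result = df1.assign(diff=df1['b'] - df1['c'].map(lookup))
print(result)
print('diff' in df1.columns)

# Caveat: a one-row, one-column frame squeezes all the way to a scalar.
# Squeezing only the column axis always yields a Series.
one_row = pd.DataFrame({'x': [20]}, index=pd.Index(['R'], name='c'))
print(type(one_row.squeeze()))
print(type(one_row.squeeze(axis=1)))
```

If df2 could ever shrink to a single row, df2['x'] or df2.squeeze(axis=1) is the safer way to get the lookup Series.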

Answer 2: (score: 1)

df1["diff"] = df1.apply(lambda x: x.b - df2.loc[x.c].values[0], axis=1)
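One practical difference between this row-wise lookup and the map-based answers shows up when df1 contains a label that is missing from df2. The sketch below uses a reduced dataset with plain strings and a hypothetical extra label 'Z' that is not part of the question's data.

```python
import pandas as pd

df1 = pd.DataFrame(
    {'b': [51.12, 39.98, 7.5], 'c': ['R', 'F', 'Z']},  # 'Z' is hypothetical
    index=pd.Index([188, 188, 750], name='a'),
)
df2 = pd.DataFrame({'x': [20, 50]}, index=pd.Index(['R', 'F'], name='c'))

# map: labels missing from the lookup Series become NaN, so the diff is NaN
vectorized = df1['b'] - df1['c'].map(df2['x'])
print(vectorized)

# apply with .loc: a missing label raises KeyError on the offending row
try:
    df1.apply(lambda row: row.b - df2.loc[row.c].values[0], axis=1)
    apply_failed = False
except KeyError:
    apply_failed = True
print(apply_failed)
```

Which behaviour is preferable depends on whether unmatched labels should be flagged as errors or carried through as missing values. The apply version also calls a Python function once per row, so it tends to be slower on large frames.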