pandas: update column values from another DataFrame's values

Time: 2017-06-28 05:22:36

Tags: python pandas

I have the following two DataFrames:

df_a = 
   id    val 
0  A100  11
1  A101  12
2  A102  13
3  A103  14
4  A104  15


df_b = 
   id    loc  val 
0  A100  12
1  A100  23
2  A100  32
3  A102  21
4  A102  38
5  A102  12
6  A102  18
7  A102  19
..... 

Desired result:

df_b = 
   id    loc  val 
0  A100  12   11
1  A100  23   11
2  A100  32   11
3  A102  21   13
4  A102  38   13
5  A102  12   13
6  A102  18   13
7  A102  19   13
..... 

I tried to update the 'val' column of df_b from the 'val' column of df_a, like this:

for index, row in df_a.iterrows():
    v = row['val']
    seq = df_a.loc[df_a['val'] == v] 
    df_b.loc[df_b['val'] == v, 'val'] = seq['val'] 

df_x = df_b.join(df_a, on=['id'], how='inner', lsuffix='_left', rsuffix='_right') 

However, I couldn't get either approach to work. How can I solve this tricky problem?

Thanks

1 Answer:

Answer 0: (score: 3)

You can use map with a Series created by set_index:

df_b['val'] = df_b['id'].map(df_a.set_index('id')['val'])
print(df_b)
     id  loc  val
0  A100   12   11
1  A100   23   11
2  A100   32   11
3  A102   21   13
4  A102   38   13
5  A102   12   13
6  A102   18   13
7  A102   19   13
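
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same map call. How the frames are constructed is my assumption, and the name `lookup` is only for illustration. As far as I know, map expects the ids in df_a to be unique, and any id in df_b that is missing from df_a would come out as NaN.

```python
import pandas as pd

# Frames rebuilt from the question's sample data
df_a = pd.DataFrame({
    'id': ['A100', 'A101', 'A102', 'A103', 'A104'],
    'val': [11, 12, 13, 14, 15],
})
df_b = pd.DataFrame({
    'id': ['A100', 'A100', 'A100', 'A102', 'A102', 'A102', 'A102', 'A102'],
    'loc': [12, 23, 32, 21, 38, 12, 18, 19],
})

# Series indexed by id: the lookup table that map() consults
lookup = df_a.set_index('id')['val']

# Each id in df_b is replaced by the matching val from df_a
df_b['val'] = df_b['id'].map(lookup)
print(df_b)
```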

Alternatively, use merge with a left join:

import pandas as pd

df = pd.merge(df_b, df_a, on='id', how='left')
print(df)
     id  loc  val
0  A100   12   11
1  A100   23   11
2  A100   32   11
3  A102   21   13
4  A102   38   13
5  A102   12   13
6  A102   18   13
7  A102   19   13
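
One thing to watch for: the output above has a single val column only when df_b does not already carry its own val column. If it does (the question displays one), the merge would produce val_x and val_y instead. A possible way around that, sketched below with a hypothetical empty placeholder column, is to drop the old column first. The `validate='many_to_one'` argument is an optional extra I added; it asks merge to raise if an id occurs more than once in df_a.

```python
import pandas as pd

df_a = pd.DataFrame({'id': ['A100', 'A101', 'A102'], 'val': [11, 12, 13]})

# df_b with a stale placeholder 'val' column (hypothetical values)
df_b = pd.DataFrame({
    'id': ['A100', 'A100', 'A102'],
    'loc': [12, 23, 21],
    'val': [None, None, None],
})

# Drop the old column so the merge does not yield val_x / val_y,
# and let merge check that each id is unique on the df_a side.
merged = pd.merge(df_b.drop(columns='val'), df_a,
                  on='id', how='left', validate='many_to_one')
print(merged)
```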

If id is the only column the two DataFrames have in common, you can omit the on parameter:

df = pd.merge(df_b, df_a, how='left')
print(df)
     id  loc  val
0  A100   12   11
1  A100   23   11
2  A100   32   11
3  A102   21   13
4  A102   38   13
5  A102   12   13
6  A102   18   13
7  A102   19   13
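
To see why omitting on works here, the following small sketch (with a cut-down version of the sample data, my own choice) lists the columns the two frames share and checks that the implicit and explicit merges agree. When on is left out, merge joins on every shared column, so this shortcut is only safe while id is the sole overlap.

```python
import pandas as pd

df_a = pd.DataFrame({'id': ['A100', 'A102'], 'val': [11, 13]})
df_b = pd.DataFrame({'id': ['A100', 'A102', 'A102'], 'loc': [12, 21, 38]})

# Columns present in both frames; with `on` omitted, merge joins on these
common = df_b.columns.intersection(df_a.columns).tolist()
print(common)

implicit = pd.merge(df_b, df_a, how='left')
explicit = pd.merge(df_b, df_a, on='id', how='left')

# Both calls should give the same frame when 'id' is the only shared column
print(implicit.equals(explicit))
```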