I have the following two DataFrames:
df_a =
id val
0 A100 11
1 A101 12
2 A102 13
3 A103 14
4 A104 15
df_b =
id loc val
0 A100 12
1 A100 23
2 A100 32
3 A102 21
4 A102 38
5 A102 12
6 A102 18
7 A102 19
.....
Expected result:
df_b =
id loc val
0 A100 12 11
1 A100 23 11
2 A100 32 11
3 A102 21 13
4 A102 38 13
5 A102 12 13
6 A102 18 13
7 A102 19 13
.....
I tried to update the 'val' column of df_b with the matching values from the 'val' column of df_a, using
for index, row in df_a.iterrows():
    v = row['val']
    seq = df_a.loc[df_a['val'] == v]
    df_b.loc[df_b['val'] == v, 'val'] = seq['val']
or
df_x = df_b.join(df_a, on=['id'], how='inner', lsuffix='_left', rsuffix='_right')
However, neither attempt gives the expected result. How can I solve this?
Thanks.
Answer 0 (score: 3)
# Turn df_a into an id -> val lookup Series, then map each id in df_b through it
df_b['val'] = df_b['id'].map(df_a.set_index('id')['val'])
print(df_b)
id loc val
0 A100 12 11
1 A100 23 11
2 A100 32 11
3 A102 21 13
4 A102 38 13
5 A102 12 13
6 A102 18 13
7 A102 19 13
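For reference, here is a self-contained sketch of the map approach. The sample frames are rebuilt by hand from the tables in the question, and it assumes that df_b starts without a 'val' column and that each id appears only once in df_a; the name lookup is just an illustrative helper.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: df_b has no 'val' column yet).
df_a = pd.DataFrame({
    "id": ["A100", "A101", "A102", "A103", "A104"],
    "val": [11, 12, 13, 14, 15],
})
df_b = pd.DataFrame({
    "id": ["A100"] * 3 + ["A102"] * 5,
    "loc": [12, 23, 32, 21, 38, 12, 18, 19],
})

# Series indexed by id, holding the value to look up (hypothetical helper name).
lookup = df_a.set_index("id")["val"]

# Each id in df_b is replaced by the value found under that id in the lookup.
df_b["val"] = df_b["id"].map(lookup)

print(df_b)
```

Any id in df_b that has no match in df_a would end up as NaN in the new column.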
Or use merge with a left join:
import pandas as pd

# Left join keeps every row of df_b and pulls in 'val' from df_a by id
df = pd.merge(df_b, df_a, on='id', how='left')
print(df)
id loc val
0 A100 12 11
1 A100 23 11
2 A100 32 11
3 A102 21 13
4 A102 38 13
5 A102 12 13
6 A102 18 13
7 A102 19 13
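The output above has a single 'val' column, which suggests df_b did not carry its own 'val' column when it was merged. If df_b already has an empty 'val' placeholder (an assumption here, with shortened sample data), merging on 'id' would produce 'val_x' and 'val_y' instead. One possible way around that is to drop the placeholder first:

```python
import numpy as np
import pandas as pd

df_a = pd.DataFrame({"id": ["A100", "A101", "A102"], "val": [11, 12, 13]})

# Hypothetical df_b that already has an empty 'val' column.
df_b = pd.DataFrame({
    "id": ["A100", "A100", "A102"],
    "loc": [12, 23, 21],
    "val": [np.nan, np.nan, np.nan],
})

# Merging as-is keeps both 'val' columns, renamed with the default suffixes.
clashing = pd.merge(df_b, df_a, on="id", how="left")
print(clashing.columns.tolist())

# Dropping the placeholder first leaves a single 'val' taken from df_a.
merged = pd.merge(df_b.drop(columns="val"), df_a, on="id", how="left")
print(merged)
```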
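If id is the only column the two DataFrames have in common, the on parameter can be omitted, because merge joins on all shared columns by default: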
df = pd.merge(df_b,df_a, how='left')
print (df)
id loc val
0 A100 12 11
1 A100 23 11
2 A100 32 11
3 A102 21 13
4 A102 38 13
5 A102 12 13
6 A102 18 13
7 A102 19 13
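As for the join attempt in the question: DataFrame.join matches the column named in on against the index of the other frame, not against its 'id' column, so df_a's default integer index gives the string ids nothing to match. A minimal sketch of a working variant, using shortened sample data and assuming df_b has no 'val' column of its own, sets 'id' as the index of df_a first:

```python
import pandas as pd

df_a = pd.DataFrame({"id": ["A100", "A101", "A102"], "val": [11, 12, 13]})
df_b = pd.DataFrame({"id": ["A100", "A100", "A102"], "loc": [12, 23, 21]})

# join looks up df_b['id'] in the index of the other frame,
# so 'id' has to be the index of df_a for the lookup to succeed.
df_x = df_b.join(df_a.set_index("id"), on="id", how="left")

print(df_x)
```

Since 'id' is now the index of the right-hand frame, no column names overlap and the lsuffix and rsuffix arguments are not needed.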