I have two DataFrames. The first is:
id code
1 2
2 3
3 3
4 1
The second is:
id code name
1 1 Mary
2 2 Ben
3 3 John
I want to map the names onto the first DataFrame so that it looks like this:
id code name
1 2 Ben
2 3 John
3 3 John
4 1 Mary
I tried the following code:
mapping = dict(df2[['code','name']].values)
df1['name'] = df1['code'].map(mapping)
My mapping is correct, but the mapped values are all NaN:
mapping = {1:"Mary", 2:"Ben", 3:"John"}
id code name
1 2 NaN
2 3 NaN
3 3 NaN
4 1 NaN
Does anyone know why this happens and how to fix it?
Answer 0 (score: 3)
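The problem is that the values in the code column have different types in the two DataFrames (strings in df1, integers in df2), so the dictionary lookup never finds a match. Use astype to convert both sides to the same type, either integers or strings: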
print (df1['code'].dtype)
object
print (df2['code'].dtype)
int64
print (type(df1.loc[0, 'code']))
<class 'str'>
print (type(df2.loc[0, 'code']))
<class 'numpy.int64'>
mapping = dict(df2[['code','name']].values)
# Option 1: make both sides integers
df1['name'] = df1['code'].astype(int).map(mapping)
# Option 2: make both sides strings (object dtype)
df2['code'] = df2['code'].astype(str)
mapping = dict(df2[['code','name']].values)
df1['name'] = df1['code'].map(mapping)
print (df1)
id code name
0 1 2 Ben
1 2 3 John
2 3 3 John
3 4 1 Mary
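To see the mismatch in isolation, here is a minimal sketch. It assumes df1['code'] holds strings (as the dtype output above suggests) and rebuilds both frames by hand. The set_index(...).to_dict() call is just an alternative way to build the same lookup, not the approach used in the answer.

```python
import pandas as pd

# Assumed reconstruction of the question's data: df1['code'] holds strings
# (for example, read from a file as text), df2['code'] holds integers.
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "code": ["2", "3", "3", "1"]})
df2 = pd.DataFrame({"id": [1, 2, 3], "code": [1, 2, 3],
                    "name": ["Mary", "Ben", "John"]})

# Build the lookup from df2; its keys are integers.
mapping = df2.set_index("code")["name"].to_dict()

# The string "2" is not equal to the integer 2, so every lookup misses.
broken = df1["code"].map(mapping)

# Casting the strings to int makes the values match the dictionary keys.
fixed = df1["code"].astype(int).map(mapping)

print(broken.isna().all())  # True
print(fixed.tolist())       # ['Ben', 'John', 'John', 'Mary']
```

Checking df1['code'].dtype and df2['code'].dtype before mapping is a quick way to catch this kind of problem.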
Answer 1 (score: 2)
Another way is to use DataFrame.merge:
df1.merge(df2.drop(columns='id'), how='left', on='code')
Output:
id code name
0 1 2 Ben
1 2 3 John
2 3 3 John
3 4 1 Mary
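The merge approach depends on the same dtype alignment: as far as I know, recent pandas versions refuse to merge a string key column with an integer one. Below is a hedged sketch under the same assumption that df1['code'] holds strings. The names left and right are only illustrative.

```python
import pandas as pd

# Assumed reconstruction of the question's data (df1 codes are strings).
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "code": ["2", "3", "3", "1"]})
df2 = pd.DataFrame({"id": [1, 2, 3], "code": [1, 2, 3],
                    "name": ["Mary", "Ben", "John"]})

# Align the key dtype first so both 'code' columns are integers.
left = df1.assign(code=df1["code"].astype(int))

# Drop df2's own id so it does not collide with df1's id column.
right = df2.drop(columns="id")

# A left join keeps every row of df1, in its original order.
merged = left.merge(right, how="left", on="code")
print(merged)
```

Compared with map, merge is convenient when more than one column needs to be brought over from df2 at once.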