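I'm having trouble creating a new column based on the values of df2['idx']. What I want to get is the value in df1['score'] at the row whose actual index equals that value.
To make this easier to understand, here are my two sample DataFrames: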
df1= pd.DataFrame({'cluster':[1,2,3,4,5], 'score':[80, 90, 60, 40, 12]})
df2= pd.DataFrame({'word':["hello", "my", "name", "is", "tom"], 'label':["aa", "bb", "cc", "dd", "ee"], 'idx':[1,3,4,4,4]})
This is the result I expect, with the score looked up using df2's 'idx' column against df1's actual index:
df3= pd.DataFrame({'word':["hello", "my", "name", "is", "tom"], 'label':["aa", "bb", "cc", "dd", "ee"], 'idx':[1,3,4,4,4], 'score':[90, 40, 12, 12, 12]})
Answer 0 (score: 1)
Use Series.map with df1['score'], which matches each value of df2['idx'] against the index of df1:
df2['score'] = df2['idx'].map(df1['score'])
print(df2)
    word label  idx  score
0  hello    aa    1     90
1     my    bb    3     40
2   name    cc    4     12
3     is    dd    4     12
4    tom    ee    4     12
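For completeness, here is a self-contained sketch of the same approach that can be run as-is. It also shows what happens, as far as I can tell, when an idx value has no matching index in df1: map leaves NaN there. The `probe` Series and its value 7 are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Sample frames from the question.
df1 = pd.DataFrame({'cluster': [1, 2, 3, 4, 5],
                    'score': [80, 90, 60, 40, 12]})
df2 = pd.DataFrame({'word': ["hello", "my", "name", "is", "tom"],
                    'label': ["aa", "bb", "cc", "dd", "ee"],
                    'idx': [1, 3, 4, 4, 4]})

# Passing a Series to map looks each idx value up in that Series' index
# (df1's default 0..4 RangeIndex here), not in the 'cluster' column.
df2['score'] = df2['idx'].map(df1['score'])
print(df2)

# Hypothetical extra case: 7 is not in df1's index, so the result is NaN
# and the mapped Series becomes float.
probe = pd.Series([1, 7]).map(df1['score'])
print(probe)
```

If the lookup should instead go through the 'cluster' column, one option would be to map against df1.set_index('cluster')['score'].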