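I have 2 data frames. I need to use the ref table to replace 'Region' in df (or add a column alongside it) with the matching 'Code', looked up by 'Place' in ref. Note that this is only an example: the real file has 100,000+ rows and more complicated values. Please help.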
import pandas as pd

df = pd.DataFrame({'Date': ['1/1/11', '1/2/11', '1/2/11', '1/2/11', '1/3/11', '1/3/11', '1/3/11', '1/3/11', '1/4/11', '1/5/11', '1/5/11', '1/5/11'],
                   'Prod': ['Quad', 'Bellen', 'Quad', 'Bellen', 'Sunshine', 'Carlota', 'Sunset', 'Sunshine', 'Sunset', 'Sunset', 'Sunshine', 'Carlota'],
                   'Region': ['East', 'South', 'West', 'West', 'East', 'MidWest', 'South', 'South', 'MidWest', 'South', 'West', 'West']})
ref = pd.DataFrame({'Place': ['West', 'East', 'South', 'MidWest'],
                    'Code': ['W', 'E', 'S', 'MW']})
Answer 0 (score: 2)
You need map:
df['Region'] = df['Region'].map(ref.set_index('Place')['Code'])
      Date      Prod  Region
0   1/1/11      Quad       E
1   1/2/11    Bellen       S
2   1/2/11      Quad       W
3   1/2/11    Bellen       W
4   1/3/11  Sunshine       E
5   1/3/11   Carlota      MW
6   1/3/11    Sunset       S
7   1/3/11  Sunshine       S
8   1/4/11    Sunset      MW
9   1/5/11    Sunset       S
10  1/5/11  Sunshine       W
11  1/5/11   Carlota       W
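Since the question also mentions adding a column rather than overwriting 'Region', here is a minimal sketch of that variant. The 'North' row is a made-up value that is not in ref, included only to show what map does with an unmatched key. The sketch also assumes each 'Place' appears only once in ref.

```python
import pandas as pd

# Small stand-in for the real data; 'North' is hypothetical and absent from ref.
df = pd.DataFrame({'Prod': ['Quad', 'Bellen', 'Sunset'],
                   'Region': ['East', 'South', 'North']})
ref = pd.DataFrame({'Place': ['West', 'East', 'South', 'MidWest'],
                    'Code': ['W', 'E', 'S', 'MW']})

# Build the lookup once: a Series indexed by Place whose values are the codes.
lookup = ref.set_index('Place')['Code']

# Write the result to a new column so the original 'Region' stays intact.
df['Code'] = df['Region'].map(lookup)

print(df)
```

Rows whose 'Region' is not found in ref end up with NaN in the new column, which is a quick way to spot values missing from the lookup table.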
Edit: if you want to keep region names that do not exist in ref (map alone turns them into NaN), use
df['Region'] = df['Region'].map(ref.set_index('Place')['Code']).combine_first(df['Region'])
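To make the effect of combine_first concrete, here is a small sketch with a made-up 'North' region that ref does not contain. It also shows fillna with the original column, which should give the same result here; treat that as an alternative spelling rather than part of the original answer.

```python
import pandas as pd

# 'North' is a hypothetical value with no entry in ref.
df = pd.DataFrame({'Region': ['East', 'South', 'North']})
ref = pd.DataFrame({'Place': ['West', 'East', 'South', 'MidWest'],
                    'Code': ['W', 'E', 'S', 'MW']})

lookup = ref.set_index('Place')['Code']
mapped = df['Region'].map(lookup)            # 'North' becomes NaN here

# Fill the gaps left by map with the original region names.
kept = mapped.combine_first(df['Region'])
also_kept = mapped.fillna(df['Region'])      # alternative spelling of the same idea

print(kept.tolist())
print(also_kept.tolist())
```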
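Edit: @Wen is right, you can use replace instead of map. replace leaves values that are not found in ref unchanged, so combine_first is not needed:
df['Region'] = df['Region'].replace(ref.set_index('Place')['Code'])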
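A quick sketch comparing the two approaches on a tiny Series. The 'North' value is made up for illustration, and the lookup is converted to a plain dict here only to keep the example explicit. On the 100,000+ rows mentioned in the question it is probably worth timing both, since replace with a large mapping may be slower than map.

```python
import pandas as pd

# 'North' is hypothetical and has no entry in ref.
region = pd.Series(['East', 'South', 'North'], name='Region')
ref = pd.DataFrame({'Place': ['West', 'East', 'South', 'MidWest'],
                    'Code': ['W', 'E', 'S', 'MW']})

mapping = ref.set_index('Place')['Code'].to_dict()

# replace: unmatched values pass through untouched.
via_replace = region.replace(mapping)

# map: unmatched values become NaN, so fall back to the original values.
via_map = region.map(mapping).combine_first(region)

print(via_replace.tolist())
print(via_map.tolist())
```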