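Joining two pandas dataframes on a column (country codes)

Time: 2018-06-03 13:50:07

Tags: python pandas dataframe

I want the country codes stored as alpha_3_code in the df dataframe to appear in the Nationality_Codes column of my df2 dataframe. For each row in df2, I want to match Reviewer_Nationality against en_short_name in df and, if they match, assign the country code to Nationality_Codes in df2.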

df2.head()

Nationality_Codes   Reviewer_Nationality    Reviewer_Score
NaN                       Russia                  2.9
NaN                       United Kingdom          7.5
NaN                       Australia               7.1
NaN                       United Kingdom          3.8
NaN                       Russia                  6.7

df.head()

alpha_3_code       en_short_name           nationality
RUS                 Russia                  Russian
ALA                 Åland Islands           Åland Island
ALB                 Albania                 Albanian 
AUS                 Australia               Australian
UK                  United Kingdom          British, UK

The final result should be:

df2.head()

Nationality_Codes   Reviewer_Nationality    Reviewer_Score
RUS                       Russia                  2.9
UK                        United Kingdom          7.5
AUS                       Australia               7.1
UK                        United Kingdom          3.8
RUS                       Russia                  6.7

I tried this code, but it did not work.

for index, row in df.iterrows():
    for index2, row2 in df2.iterrows():
        if row2['Reviewer_Nationality']==row['en_short_name']:
            df2['Nationality_Codes'][row2]=df['alpha_3_code'][row2]

Can anyone help me?

Thanks a lot!

2 Answers:

Answer 0: (score: 3)

One way is to build a Series that maps the English country names to their codes, then use .map:

#my_map = pd.Series(df.alpha_3_code.values,index=df.en_short_name)
my_map = df.set_index('en_short_name')['alpha_3_code']

df2['Nationality_Codes'] = df2['Reviewer_Nationality'].map(my_map)

Output:

>>> df2
  Nationality_Codes Reviewer_Nationality  Reviewer_Score
0               RUS               Russia             2.9
1                UK       United Kingdom             7.5
2               AUS            Australia             7.1
3                UK       United Kingdom             3.8
4               RUS               Russia             6.7
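
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same .map approach. The sample frames are rebuilt from the tables in the question (the nationality column is left out because it is not needed). The drop_duplicates call and the `unmatched` check are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question's tables.
df = pd.DataFrame({
    "alpha_3_code": ["RUS", "ALA", "ALB", "AUS", "UK"],
    "en_short_name": ["Russia", "Åland Islands", "Albania",
                      "Australia", "United Kingdom"],
})
df2 = pd.DataFrame({
    "Reviewer_Nationality": ["Russia", "United Kingdom", "Australia",
                             "United Kingdom", "Russia"],
    "Reviewer_Score": [2.9, 7.5, 7.1, 3.8, 6.7],
})

# Lookup Series: index = country name, values = country code.
# drop_duplicates is a precaution (an assumption about real data):
# a lookup with repeated names would be ambiguous.
my_map = (
    df.drop_duplicates("en_short_name")
      .set_index("en_short_name")["alpha_3_code"]
)

# Each name in df2 is looked up in my_map; the row order of df2 is kept.
df2["Nationality_Codes"] = df2["Reviewer_Nationality"].map(my_map)

# Names that are absent from the lookup come back as NaN, so they are easy to list.
unmatched = df2.loc[df2["Nationality_Codes"].isna(),
                    "Reviewer_Nationality"].unique()

print(df2)
print("unmatched names:", list(unmatched))
```

If `unmatched` is not empty on real data, the usual suspects are spelling differences or stray whitespace between the two name columns.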

Answer 1: (score: 1)

Try this:

# Parentheses let the method chain span several lines.
# df is the left table here, so how='left' keeps every country in df.
merged = (
    df[['alpha_3_code', 'en_short_name']]
    .merge(df2[['Reviewer_Nationality', 'Reviewer_Score']],
           left_on='en_short_name', right_on='Reviewer_Nationality', how='left')
    .rename(columns={'alpha_3_code': 'Nationality_Codes'})
    .drop('en_short_name', axis=1)
)
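
Note that with df on the left, a left join returns one row per country in df (including countries with no reviews) rather than one row per review. If the goal is the output shown in the question, a variant with df2 on the left may be a better fit. The sketch below is one possible way to write it, not the answerer's code; the sample frames are rebuilt from the question, and the names `codes` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question's tables.
df = pd.DataFrame({
    "alpha_3_code": ["RUS", "ALA", "ALB", "AUS", "UK"],
    "en_short_name": ["Russia", "Åland Islands", "Albania",
                      "Australia", "United Kingdom"],
})
df2 = pd.DataFrame({
    "Reviewer_Nationality": ["Russia", "United Kingdom", "Australia",
                             "United Kingdom", "Russia"],
    "Reviewer_Score": [2.9, 7.5, 7.1, 3.8, 6.7],
})

# Keep only the lookup columns and give the code column its target name.
codes = (
    df[["en_short_name", "alpha_3_code"]]
    .rename(columns={"alpha_3_code": "Nationality_Codes"})
)

# df2 is the left table, so every review is kept, in its original order.
# Reviews whose nationality is missing from df would get NaN as the code.
result = (
    df2[["Reviewer_Nationality", "Reviewer_Score"]]
    .merge(codes, left_on="Reviewer_Nationality",
           right_on="en_short_name", how="left")
    .drop(columns="en_short_name")
)

print(result)
```

This assumes en_short_name is unique in df; duplicated names there would duplicate the matching review rows in the result.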