Question

I have 2 data frames. The first:

City      Country
-----------------
NY            US
LA            US
Paris         France
Roma          Italy

The second:

Place        Score         ID_ref
----------------------------------
Paris         +1            0010 
US            +5            1000
Italy         -8            3020

The output should be:

Place        Score            ID_ref
------------------------------------
Paris         +1            0010 
France        +1            0010 
US            +5            1000
LA            +5            1000
NY            +5            1000
Italy         -8            3020
Roma          -8            3020

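I thought of solving this with a double loop, but the first data frame has 5,000 rows and the second has 25,000 rows.

So I think it is better not to use a double loop.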

Answer 1

Try this, since you are merging on either 'Country' or 'City':

import pandas as pd

# df_p: City/Country frame; df_2: Place/Score/ID_ref frame.
# Index by Country (then by City), stack so every City and Country value
# becomes its own 'Location' row, and merge on 'Place' to attach the score.
df_out = pd.concat(
   [df_p.set_index(df_p['Country']).rename_axis('Place', axis=0).stack().rename('Location').reset_index().merge(df_2),
    df_p.set_index(df_p['City']).rename_axis('Place', axis=0).stack().rename('Location').reset_index().merge(df_2)],
   ignore_index=True).drop_duplicates()
df_out = df_out[['Location','Score','ID_ref']]

df_out

Output:

  Location  Score  ID_ref
0       NY      5    1000
1       US      5    1000
2       LA      5    1000
4     Roma     -8    3020
5    Italy     -8    3020
6    Paris      1      10
7   France      1      10
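
To see what the stacking step does on its own, here is a small self-contained sketch on the sample data from the question. The frame names `df_p` and `df_2` follow the answer above; the integer dtypes for `Score` and `ID_ref` and the variable names `stacked` and `by_country` are assumptions made for illustration only.

```python
import pandas as pd

# Sample data from the question (integer dtypes are an assumption)
df_p = pd.DataFrame({
    "City": ["NY", "LA", "Paris", "Roma"],
    "Country": ["US", "US", "France", "Italy"],
})
df_2 = pd.DataFrame({
    "Place": ["Paris", "US", "Italy"],
    "Score": [1, 5, -8],
    "ID_ref": [10, 1000, 3020],
})

# Index each row by its Country, then stack: every City and Country value
# becomes its own row, tagged with the Country it belongs to.
stacked = (
    df_p.set_index(df_p["Country"])
        .rename_axis("Place")
        .stack()
        .rename("Location")
        .reset_index()
)
print(stacked[["Place", "Location"]])

# Merging on Place keeps only the groups whose Country has a score in df_2.
by_country = stacked.merge(df_2, on="Place")
print(by_country[["Location", "Score", "ID_ref"]])
```

Repeating the same steps with `City` as the index picks up places that are scored by city name (here Paris, which also brings in France).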

Answer 2

Try merging first, then concatenating, as shown below:

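import pandas as pd

# df: City/Country frame; df1: Place/Score/ID_ref frame.
# Give each city the score of its country (countries without a score yield NaN).
join1 = pd.merge(df, df1, left_on='Country', right_on='Place', how='left')
join1 = join1.drop(columns=['Place', 'Country'])
join1.columns = ['Place', 'Score', 'ID_ref']
# DataFrame has no .concat method; use the top-level pd.concat instead.
result = pd.concat([join1, df1], ignore_index=True)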
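
The snippet above copies country scores onto cities only. A minimal sketch that covers both directions (a scored city also adds its country, as with Paris and France in the expected output) might look like the following. The sample frames are rebuilt from the question; the names `lookup`, `scores`, and the helper `expand` are hypothetical, and storing `ID_ref` as strings to keep the leading zeros is an assumption.

```python
import pandas as pd

# Sample data from the question (ID_ref kept as strings: an assumption)
lookup = pd.DataFrame({
    "City": ["NY", "LA", "Paris", "Roma"],
    "Country": ["US", "US", "France", "Italy"],
})
scores = pd.DataFrame({
    "Place": ["Paris", "US", "Italy"],
    "Score": [1, 5, -8],
    "ID_ref": ["0010", "1000", "3020"],
})


def expand(match_col: str, add_col: str) -> pd.DataFrame:
    """Find scored places in `match_col` and return their `add_col` partners.

    Hypothetical helper: an inner merge keeps only lookup rows whose
    `match_col` value appears in scores["Place"].
    """
    hit = lookup.merge(scores, left_on=match_col, right_on="Place")
    return hit[[add_col, "Score", "ID_ref"]].rename(columns={add_col: "Place"})


result = (
    pd.concat(
        [
            scores,                      # the original rows
            expand("Country", "City"),   # scored country -> its cities
            expand("City", "Country"),   # scored city -> its country
        ],
        ignore_index=True,
    )
    .drop_duplicates()
    .sort_values("ID_ref", kind="stable")  # group rows that share an ID_ref
    .reset_index(drop=True)
)
print(result)
```

Both merges are hash joins, so this avoids the nested loop over 5,000 × 25,000 rows. The result contains the same rows as the expected output, although the order within each `ID_ref` group may differ.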

Check existing values in a column against another dataframe and add rows
