I have two DataFrames. The first:
City    Country
-----------------
NY      US
LA      US
Paris   France
Roma    Italy
The second:
Place   Score   ID_ref
----------------------------------
Paris   +1      0010
US      +5      1000
Italy   -8      3020
The output should be:
Place   Score   ID_ref
------------------------------------
Paris   +1      0010
France  +1      0010
US      +5      1000
LA      +5      1000
NY      +5      1000
Italy   -8      3020
Roma    -8      3020
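I thought about solving this with a nested loop, but the first DataFrame has 5,000 rows and the second has 25,000.
So I think it is better to avoid a nested loop.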
Answer 0 (score: 0)
Try this, since you are merging on either 'Country' or 'City':
import pandas as pd

# df_p is assumed to be the City/Country frame, df_2 the Place/Score/ID_ref frame.
# Index each row by its Country (then by its City), stack City and Country
# into a single 'Location' column, and merge with df_2 on 'Place'.
df_out = pd.concat(
    [df_p.set_index(df_p['Country']).rename_axis('Place', axis=0).stack().reset_index(name='Location').merge(df_2),
     df_p.set_index(df_p['City']).rename_axis('Place', axis=0).stack().reset_index(name='Location').merge(df_2)],
    ignore_index=True).drop_duplicates()
df_out = df_out[['Location', 'Score', 'ID_ref']]
df_out
Output:
   Location  Score  ID_ref
0        NY      5    1000
1        US      5    1000
2        LA      5    1000
4      Roma     -8    3020
5     Italy     -8    3020
6     Paris      1      10
7    France      1      10
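The same idea can also be written with an explicit lookup table that pairs every city with its country in both directions. What follows is only a minimal sketch on the sample data from the question, not the answerer's code. The names `cities`, `scores`, `links`, and `Related` are made up for illustration, and `Score` and `ID_ref` are assumed to be strings so that `+1` and `0010` keep their formatting.

```python
import pandas as pd

# Sample data from the question (string columns are an assumption).
cities = pd.DataFrame({"City": ["NY", "LA", "Paris", "Roma"],
                       "Country": ["US", "US", "France", "Italy"]})
scores = pd.DataFrame({"Place": ["Paris", "US", "Italy"],
                       "Score": ["+1", "+5", "-8"],
                       "ID_ref": ["0010", "1000", "3020"]})

# Lookup table of (Place, Related) pairs in both directions:
# country -> each of its cities, and city -> its country.
links = pd.concat([
    cities.rename(columns={"Country": "Place", "City": "Related"}),
    cities.rename(columns={"City": "Place", "Country": "Related"}),
], ignore_index=True)

# For every scored place, copy its Score/ID_ref onto the related places.
related = (scores.merge(links, on="Place")
                 .drop(columns="Place")
                 .rename(columns={"Related": "Place"}))

# Original rows plus the inherited rows, grouped by ID_ref.
result = (pd.concat([scores, related], ignore_index=True)
            .drop_duplicates()
            .sort_values("ID_ref", kind="stable")
            .reset_index(drop=True))
print(result)
```

Because the work is done by a single join rather than a Python loop, this should scale comfortably to frames of 5,000 and 25,000 rows.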
Answer 1 (score: 0)
Try merging first, then concatenating, as shown below:
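import pandas as pd

# df is the City/Country frame, df1 the Place/Score/ID_ref frame.
# With how='left', cities whose country is not in df1 get NaN Score/ID_ref.
join1 = pd.merge(df, df1, left_on='Country', right_on='Place', how='left')
del join1['Place']
del join1['Country']
join1.columns = ['Place', 'Score', 'ID_ref']  # City becomes the new Place
# Stack the per-city rows on top of the original df1 rows.
result = pd.concat([join1, df1], ignore_index=True)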
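A runnable sketch of this merge-then-concatenate approach on the question's sample data follows. It differs from the snippet above in two ways, both of which are assumptions about what the asker wants: it uses inner merges so that unmatched rows do not produce NaN scores, and it adds a second merge on `City` so that a scored city (Paris) also pulls in its country (France). The frame names `df` and `df1` follow the answer; `by_country` and `by_city` are illustrative.

```python
import pandas as pd

# Sample data from the question (string columns are an assumption).
df = pd.DataFrame({"City": ["NY", "LA", "Paris", "Roma"],
                   "Country": ["US", "US", "France", "Italy"]})
df1 = pd.DataFrame({"Place": ["Paris", "US", "Italy"],
                    "Score": ["+1", "+5", "-8"],
                    "ID_ref": ["0010", "1000", "3020"]})

# Cities whose country is scored (NY and LA inherit the US row).
by_country = (df.merge(df1, left_on="Country", right_on="Place")
                [["City", "Score", "ID_ref"]]
                .rename(columns={"City": "Place"}))

# Countries whose city is scored (France inherits the Paris row).
by_city = (df.merge(df1, left_on="City", right_on="Place")
             [["Country", "Score", "ID_ref"]]
             .rename(columns={"Country": "Place"}))

# Original rows plus both sets of inherited rows.
result = (pd.concat([df1, by_country, by_city], ignore_index=True)
            .drop_duplicates(ignore_index=True))
print(result)
```

Both merges are vectorized joins, so no nested loop over the two frames is needed.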