我正在尝试从 df 表中附加纬度和经度数据:
dict = {'city':['Wien', 'Prague','Berlin','London','Rome'],
'latitude': [48.20849, 50.08804, 52.52437, 51.50853, 41.89193 ],
'longitude': [16.37208, 14.42076, 13.41053, -0.12574, 12.51133]
}
df = pd.DataFrame(dict)
# creating non duplicated pairs of cities
df_pair = pd.DataFrame(list(combinations(df.city, 2)), columns=['start_city', 'end_city'])
进入 df_pair 的列 start_latitude、start_longitude、end_latitude、end_longitude(将在追加时创建):
start_city end_city
0 Wien Prague
1 Wien Berlin
2 Wien London
3 Wien Rome
4 Prague Berlin
5 Prague London
6 Prague Rome
7 Berlin London
8 Berlin Rome
9 London Rome
所以最终的数据帧(让我们调用 df_pair_geo)看起来像这样:
start_city end_city start_latitude start_longitude end_latitude end_longitude
0 Wien Prague 48.20849 16.37208 50.08804 14.42076
1 Wien Berlin 48.20849 16.37208 52.52437 13.41053
2 Wien London 48.20849 16.37208 51.50853 -0.12574
3 Wien Rome 48.20849 16.37208 41.89193 12.51133
4 Prague Berlin 50.08804 14.42076 52.52437 13.41053
5 Prague London 50.08804 14.42076 51.50853 -0.12574
6 Prague Rome 50.08804 14.42076 41.89193 12.51133
7 Berlin London 52.52437 13.41053 51.50853 -0.12574
8 Berlin Rome 52.52437 13.41053 41.89193 12.51133
9 London Rome 51.50853 -0.12574 41.89193 12.51133
但到目前为止我无法做到这一点。有没有办法做到这一点?谢谢。
答案 0 :(得分:3)
使用合并。
df1 = df_pair.merge(df.set_index('city'), left_on='start_city', right_index=True, how='left')
df2 = df1.merge(df.set_index('city'), left_on='end_city', right_index=True, how='left', suffixes=['_start', '_end'])
# result
print(df2)
start_city end_city latitude_start longitude_start latitude_end \
0 Wien Prague 48.20849 16.37208 50.08804
1 Wien Berlin 48.20849 16.37208 52.52437
2 Wien London 48.20849 16.37208 51.50853
3 Wien Rome 48.20849 16.37208 41.89193
4 Prague Berlin 50.08804 14.42076 52.52437
5 Prague London 50.08804 14.42076 51.50853
6 Prague Rome 50.08804 14.42076 41.89193
7 Berlin London 52.52437 13.41053 51.50853
8 Berlin Rome 52.52437 13.41053 41.89193
9 London Rome 51.50853 -0.12574 41.89193
longitude_end
0 14.42076
1 13.41053
2 -0.12574
3 12.51133
4 13.41053
5 -0.12574
6 12.51133
7 -0.12574
8 12.51133
9 12.51133
答案 1 :(得分:1)
将 DataFrame.join
与 DataFrame.add_suffix
一起使用:
df1 = (df_pair.join(df.set_index('city').add_prefix('start_'), on='start_city')
.join(df.set_index('city').add_prefix('end_'), on='end_city'))
print (df1)
start_city end_city start_latitude start_longitude end_latitude \
0 Wien Prague 48.20849 16.37208 50.08804
1 Wien Berlin 48.20849 16.37208 52.52437
2 Wien London 48.20849 16.37208 51.50853
3 Wien Rome 48.20849 16.37208 41.89193
4 Prague Berlin 50.08804 14.42076 52.52437
5 Prague London 50.08804 14.42076 51.50853
6 Prague Rome 50.08804 14.42076 41.89193
7 Berlin London 52.52437 13.41053 51.50853
8 Berlin Rome 52.52437 13.41053 41.89193
9 London Rome 51.50853 -0.12574 41.89193
end_longitude
0 14.42076
1 13.41053
2 -0.12574
3 12.51133
4 13.41053
5 -0.12574
6 12.51133
7 -0.12574
8 12.51133
9 12.51133