Appending data from one Pandas DataFrame to another

Time: 2021-01-21 08:53:36

Tags: python-3.x pandas dataframe

I am trying to append the latitude and longitude data from the table df:

from itertools import combinations

import pandas as pd

# city coordinates; named `data` so the built-in dict is not shadowed
data = {'city': ['Wien', 'Prague', 'Berlin', 'London', 'Rome'],
        'latitude': [48.20849, 50.08804, 52.52437, 51.50853, 41.89193],
        'longitude': [16.37208, 14.42076, 13.41053, -0.12574, 12.51133]
       }
df = pd.DataFrame(data)

# create non-duplicated pairs of cities
df_pair = pd.DataFrame(list(combinations(df.city, 2)), columns=['start_city', 'end_city'])

into the columns start_latitude, start_longitude, end_latitude and end_longitude of df_pair (these columns will be created when the data is appended):

    start_city  end_city
0   Wien        Prague
1   Wien        Berlin
2   Wien        London
3   Wien        Rome
4   Prague      Berlin
5   Prague      London
6   Prague      Rome
7   Berlin      London
8   Berlin      Rome
9   London      Rome

So the final DataFrame (let's call it df_pair_geo) would look like this:

  start_city    end_city    start_latitude  start_longitude end_latitude end_longitude
0   Wien        Prague      48.20849        16.37208        50.08804     14.42076
1   Wien        Berlin      48.20849        16.37208        52.52437     13.41053
2   Wien        London      48.20849        16.37208        51.50853     -0.12574
3   Wien        Rome        48.20849        16.37208        41.89193     12.51133
4   Prague      Berlin      50.08804        14.42076        52.52437     13.41053
5   Prague      London      50.08804        14.42076        51.50853     -0.12574
6   Prague      Rome        50.08804        14.42076        41.89193     12.51133
7   Berlin      London      52.52437        13.41053        51.50853     -0.12574
8   Berlin      Rome        52.52437        13.41053        41.89193     12.51133
9   London      Rome        51.50853        -0.12574        41.89193     12.51133

But so far I have not been able to do this. Is there a way to do it? Thanks.

2 Answers:

Answer 0: (score: 3)

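Use merge, once for the start city and once for the end city. The second merge finds latitude and longitude on both sides, so the suffixes are applied to those columns.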

df1 = df_pair.merge(df.set_index('city'), left_on='start_city', right_index=True, how='left')
df2 =     df1.merge(df.set_index('city'), left_on='end_city',   right_index=True, how='left', suffixes=['_start', '_end'])

# result
print(df2)

  start_city end_city  latitude_start  longitude_start  latitude_end  \
0       Wien   Prague        48.20849         16.37208      50.08804   
1       Wien   Berlin        48.20849         16.37208      52.52437   
2       Wien   London        48.20849         16.37208      51.50853   
3       Wien     Rome        48.20849         16.37208      41.89193   
4     Prague   Berlin        50.08804         14.42076      52.52437   
5     Prague   London        50.08804         14.42076      51.50853   
6     Prague     Rome        50.08804         14.42076      41.89193   
7     Berlin   London        52.52437         13.41053      51.50853   
8     Berlin     Rome        52.52437         13.41053      41.89193   
9     London     Rome        51.50853         -0.12574      41.89193   

   longitude_end  
0       14.42076  
1       13.41053  
2       -0.12574  
3       12.51133  
4       13.41053  
5       -0.12574  
6       12.51133  
7       -0.12574  
8       12.51133  
9       12.51133  
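
The merge above yields columns named latitude_start, longitude_start and so on, while the question asks for start_latitude, start_longitude, end_latitude and end_longitude. A minimal sketch of one way to get the requested names with merge is to prefix the lookup table before each merge. The helper add_coords and its parameter names are hypothetical and not part of the original answer:

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({
    'city': ['Wien', 'Prague', 'Berlin', 'London', 'Rome'],
    'latitude': [48.20849, 50.08804, 52.52437, 51.50853, 41.89193],
    'longitude': [16.37208, 14.42076, 13.41053, -0.12574, 12.51133],
})
df_pair = pd.DataFrame(list(combinations(df.city, 2)),
                       columns=['start_city', 'end_city'])


def add_coords(pairs, coords, role):
    # Hypothetical helper: prefix every column of `coords` with e.g. 'start_',
    # so 'city' becomes 'start_city' and can be used as the merge key.
    renamed = coords.add_prefix(f'{role}_')
    return pairs.merge(renamed, on=f'{role}_city', how='left')


df_pair_geo = add_coords(add_coords(df_pair, df, 'start'), df, 'end')
print(df_pair_geo.columns.tolist())
```

Because the city column is prefixed together with the coordinates, it lines up with start_city or end_city directly, and no suffixes are needed.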

Answer 1: (score: 1)

Use DataFrame.join together with DataFrame.add_prefix:

df1 = (df_pair.join(df.set_index('city').add_prefix('start_'), on='start_city')
              .join(df.set_index('city').add_prefix('end_'), on='end_city'))

print (df1)
  start_city end_city  start_latitude  start_longitude  end_latitude  \
0       Wien   Prague        48.20849         16.37208      50.08804   
1       Wien   Berlin        48.20849         16.37208      52.52437   
2       Wien   London        48.20849         16.37208      51.50853   
3       Wien     Rome        48.20849         16.37208      41.89193   
4     Prague   Berlin        50.08804         14.42076      52.52437   
5     Prague   London        50.08804         14.42076      51.50853   
6     Prague     Rome        50.08804         14.42076      41.89193   
7     Berlin   London        52.52437         13.41053      51.50853   
8     Berlin     Rome        52.52437         13.41053      41.89193   
9     London     Rome        51.50853         -0.12574      41.89193   

   end_longitude  
0       14.42076  
1       13.41053  
2       -0.12574  
3       12.51133  
4       13.41053  
5       -0.12574  
6       12.51133  
7       -0.12574  
8       12.51133  
9       12.51133
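
Both answers rely on a left join, so a city in df_pair that has no row in df would silently end up with NaN coordinates. As a rough sketch of how that could be checked after the join, the example below uses a reduced table and a made-up pair containing 'Paris', which is an assumption for illustration and not part of the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    'city': ['Wien', 'Prague'],
    'latitude': [48.20849, 50.08804],
    'longitude': [16.37208, 14.42076],
})
# 'Paris' is a made-up city with no entry in df, used only to show the check.
df_pair = pd.DataFrame({'start_city': ['Wien', 'Wien'],
                        'end_city': ['Prague', 'Paris']})

coords = df.set_index('city')
df1 = (df_pair.join(coords.add_prefix('start_'), on='start_city')
              .join(coords.add_prefix('end_'), on='end_city'))

# Rows where either lookup failed carry NaN in the joined latitude columns.
missing = df1[df1[['start_latitude', 'end_latitude']].isna().any(axis=1)]
print(missing[['start_city', 'end_city']])
```

With the question's data every city is present in df, so missing would be empty.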