Question

I have two DataFrames: 1) one containing a list of suppliers and their lat/long coordinates

sup_essential = pd.DataFrame({'supplier': ['A','B','C'],
                              'coords': [(51.1235,-0.3453),(52.1245,-0.3423),(53.1235,-1.4553)]})

2) one containing a list of stores and their lat/long coordinates

stores_essential = pd.DataFrame({'storekey': [1,2,3],
                              'coords': [(54.1235,-0.6553),(49.1245,-1.3423),(50.1235,-1.8553)]})

I want to create an output table containing: store, store_coordinates, supplier, supplier_coordinates, and the distance for every store and supplier combination.

I currently have:

test=[]
for row in sup_essential.iterrows():
    for row in stores_essential.iterrows():
        r = sup_essential['supplier'],stores_essential['storeKey']
        test.append(r)

But this just repeats all the values.
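
The values repeat because both loops reuse the name row, and r is built from the whole columns sup_essential['supplier'] and stores_essential['storeKey'] instead of from the current rows (the column is also spelled storekey in the frame). A minimal sketch of a loop that pairs row values, assuming itertools.product over itertuples is acceptable here; the name pairs is illustrative and not from the original post:

```python
from itertools import product

import pandas as pd

sup_essential = pd.DataFrame({'supplier': ['A', 'B', 'C'],
                              'coords': [(51.1235, -0.3453), (52.1245, -0.3423), (53.1235, -1.4553)]})
stores_essential = pd.DataFrame({'storekey': [1, 2, 3],
                                 'coords': [(54.1235, -0.6553), (49.1245, -1.3423), (50.1235, -1.8553)]})

# Pair every supplier row with every store row, reading values from the
# row tuples rather than from the full columns.
pairs = [
    (st.storekey, st.coords, su.supplier, su.coords)
    for su, st in product(sup_essential.itertuples(index=False),
                          stores_essential.itertuples(index=False))
]

print(len(pairs))
print(pairs[0])
```

This yields one tuple per supplier/store combination; the distance still has to be computed for each pair.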

Answer 1

Solution:

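import numpy as np
import pandas as pd
from sklearn.neighbors import DistanceMetric  # newer scikit-learn: sklearn.metrics.DistanceMetric

# sup and stores are the question's sup_essential and stores_essential
sup, stores = sup_essential, stores_essential
dist = DistanceMetric.get_metric('haversine')

# cross join: pair every supplier with every store through a constant key
m = pd.merge(sup.assign(x=0), stores.assign(x=0), on='x', suffixes=('1','2')).drop(columns='x')

# split the (lat, lon) tuples into numeric columns
d1 = sup[['coords']].assign(lat=sup.coords.str[0], lon=sup.coords.str[1]).drop(columns='coords')
d2 = stores[['coords']].assign(lat=stores.coords.str[0], lon=stores.coords.str[1]).drop(columns='coords')

# haversine works in radians; multiply by an Earth radius of 6367 km
m['dist_km'] = np.ravel(dist.pairwise(np.radians(d1), np.radians(d2)) * 6367)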

Result:

In [135]: m
Out[135]:
              coords1 supplier             coords2  storekey     dist_km
0  (51.1235, -0.3453)        A  (54.1235, -0.6553)         1  334.029670
1  (51.1235, -0.3453)        A  (49.1245, -1.3423)         2  233.213416
2  (51.1235, -0.3453)        A  (50.1235, -1.8553)         3  153.880680
3  (52.1245, -0.3423)        B  (54.1235, -0.6553)         1  223.116901
4  (52.1245, -0.3423)        B  (49.1245, -1.3423)         2  340.738587
5  (52.1245, -0.3423)        B  (50.1235, -1.8553)         3  246.116984
6  (53.1235, -1.4553)        C  (54.1235, -0.6553)         1  122.997130
7  (53.1235, -1.4553)        C  (49.1245, -1.3423)         2  444.459052
8  (53.1235, -1.4553)        C  (50.1235, -1.8553)         3  334.514028
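
If scikit-learn is not available, the same table can be sketched with a cross join and a hand-written haversine formula. This assumes pandas 1.2 or later for how='cross'; haversine_km and the _store/_supplier suffixes are hypothetical names chosen for illustration, and the 6367 km radius is taken from the answer above.

```python
import numpy as np
import pandas as pd

sup = pd.DataFrame({'supplier': ['A', 'B', 'C'],
                    'coords': [(51.1235, -0.3453), (52.1245, -0.3423), (53.1235, -1.4553)]})
stores = pd.DataFrame({'storekey': [1, 2, 3],
                       'coords': [(54.1235, -0.6553), (49.1245, -1.3423), (50.1235, -1.8553)]})


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6367.0):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))


# Every store paired with every supplier; the shared 'coords' column
# is disambiguated by the suffixes.
m = stores.merge(sup, how='cross', suffixes=('_store', '_supplier'))

# Turn the (lat, lon) tuples into (n, 2) float arrays.
store_ll = np.array(m['coords_store'].tolist())
sup_ll = np.array(m['coords_supplier'].tolist())

m['dist_km'] = haversine_km(store_ll[:, 0], store_ll[:, 1],
                            sup_ll[:, 0], sup_ll[:, 1])
print(m)
```

The rows come out grouped by store rather than by supplier, with columns in the order storekey, coords_store, supplier, coords_supplier, dist_km.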

