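I am trying to calculate the distance between two latitude/longitude pairs using the haversine formula. I pass a Series for each of the last two function arguments because I want to compute this for many coordinates stored in two pandas columns. I get the following error: TypeError: ("'Series' object is not callable", u'occurred at index 0')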
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from math import radians, cos, sin, asin, sqrt
origin_lat = 51.507200
origin_lon = -0.127500
def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * np.arcsin(np.sqrt(a))
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r
df['dist_from_org'] = df.apply(haversine(origin_lon, origin_lat, df['ulong'], df['ulat']), axis=1)
The DataFrame df looks like this:
+----+---------+----------+
| | ulat | ulong |
+----+---------+----------+
| 0 | 52.6333 | 1.30000 |
| 1 | 51.4667 | -0.35000 |
| 2 | 51.5084 | -0.12550 |
| 3 | 51.8833 | 0.56670 |
| 4 | 51.7667 | -1.38330 |
| 5 | 55.8667 | -2.10000 |
| 6 | 55.8667 | -2.10000 |
| 7 | 52.4667 | -1.91670 |
| 8 | 51.8833 | 0.90000 |
| 9 | 53.4083 | -2.14940 |
| 10 | 53.0167 | -1.73330 |
| 11 | 51.4667 | -0.35000 |
| 12 | 51.4667 | -0.35000 |
| 13 | 52.7167 | -1.36670 |
| 14 | 51.4667 | -0.35000 |
| 15 | 52.9667 | -1.16667 |
| 16 | 51.4667 | -0.35000 |
| 17 | 51.8833 | 0.56670 |
| 18 | 51.8833 | 0.56670 |
| 19 | 51.4833 | 0.08330 |
| 20 | 52.0833 | 0.58330 |
| 21 | 52.3000 | -0.70000 |
| 22 | 51.4000 | -0.05000 |
| 23 | 51.9333 | -2.10000 |
| 24 | 51.9000 | -0.43330 |
| 25 | 53.4809 | -2.23740 |
| 26 | 51.4853 | -3.18670 |
| 27 | 51.2000 | -1.48333 |
| 28 | 51.7779 | -3.21170 |
| 29 | 51.4667 | -0.35000 |
| 30 | 51.7167 | -0.28330 |
| 31 | 52.2000 | 0.11670 |
| 32 | 52.4167 | -1.55000 |
| 33 | 56.5000 | -2.96670 |
| 34 | 51.2167 | -1.05000 |
| 35 | 51.8964 | -2.07830 |
+----+---------+----------+
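Am I not allowed to use a Series inside the pandas apply function? If so, how do I apply the function row by row and assign the output to a new column?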
Answer 0 (score: 1)
You do not need apply to call the function. Just use:
df['dist_from_org'] = haversine(origin_lon, origin_lat, df['ulong'], df['ulat'])
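As far as I can tell, the "'Series' object is not callable" message comes from the fact that apply expects a function as its first argument, whereas the question passes the result of calling haversine(...). If you do want a row-by-row apply, a minimal sketch looks like this. The name haversine_scalar and the three-row sample frame are my own, used only for illustration; the rows are copied from the question's table.

```python
import math

import pandas as pd

origin_lat = 51.507200
origin_lon = -0.127500

# A few rows copied from the question's table (illustrative sample only).
df = pd.DataFrame({
    "ulat": [52.6333, 51.4667, 51.5084],
    "ulong": [1.30000, -0.35000, -0.12550],
})


def haversine_scalar(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    # math functions work on single numbers, which is what each row provides.
    lon1, lat1, lon2, lat2 = map(math.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * math.asin(math.sqrt(a)) * 6371  # 6371 km: radius used in the question


# apply() receives a function; pandas calls it once per row because axis=1.
df["dist_from_org"] = df.apply(
    lambda row: haversine_scalar(origin_lon, origin_lat, row["ulong"], row["ulat"]),
    axis=1,
)
```

This works, but it runs a Python function call for every row, so it will usually be slower than operating on whole columns at once.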
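When I run your code (using scalar values for origin_lon and origin_lat), I get a TypeError saying that the Series cannot be converted to a float. This is caused by the assignment a = ...
I reworked the formula so that it works on Series: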
a = (dlat.divide(2).apply(sin).pow(2)
     + lat1.apply(cos).multiply(lat2.apply(cos).multiply(dlon.divide(2).apply(sin).pow(2))))
Please let me know whether this works for you.
If origin_lon and origin_lat are constants (rather than Series), use the following formula instead:
a = dlat.divide(2).apply(sin).pow(2) + cos(lat1) * lat2.apply(cos).multiply(dlon.divide(2).apply(sin).pow(2))
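Because the arguments lon2 and lat2 are pandas Series, dlon and dlat are Series objects as well. The sin and cos imported from math accept only single numbers, so you need to call apply on each Series to run the function on every one of its elements.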
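An alternative to calling apply on each Series is to use NumPy's np.sin and np.cos, which accept both scalars and Series, so the same expression covers the constant and the Series case. Below is a minimal sketch under that assumption. The name haversine_np and the three-row sample frame are mine, not from the question; the rows are copied from its table.

```python
import numpy as np
import pandas as pd

origin_lat = 51.507200
origin_lon = -0.127500

# A few rows copied from the question's table (illustrative sample only).
df = pd.DataFrame({
    "ulat": [52.6333, 51.4667, 51.5084],
    "ulong": [1.30000, -0.35000, -0.12550],
})


def haversine_np(lon1, lat1, lon2, lat2):
    """Great-circle distance in km; each argument may be a scalar or a Series."""
    # NumPy ufuncs operate elementwise on a Series and also accept plain floats.
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * np.arcsin(np.sqrt(a)) * 6371  # 6371 km: radius used in the question


# No apply needed: the whole column is computed in one vectorized call.
df["dist_from_org"] = haversine_np(origin_lon, origin_lat, df["ulong"], df["ulat"])
```

Since the scalar origin is broadcast against the columns, the result is a Series aligned with df's index and can be assigned directly to the new column.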