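My data frame has about 5 columns. Two of them hold longitude/latitude pairs as tuples. I have a separate user-defined function that computes the distance between two given lon/lat tuples.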
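The `gcd.dist` helper itself is not shown in the question. As a rough stand-in, assuming it takes two (lon, lat) tuples in degrees and returns a distance in kilometers, a haversine sketch might look like the following. The name `haversine_km` and the Earth radius of 6371 km are assumptions for illustration, not taken from the original code, so the results need not match the output printed below.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(point_a, point_b):
    """Great-circle distance in km between two (lon, lat) tuples in degrees.

    Hypothetical stand-in for the gcd.dist function mentioned in the question.
    """
    lon1, lat1 = map(math.radians, point_a)
    lon2, lat2 = map(math.radians, point_b)
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))
```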
data_all['gc_distance'] = ""
### let's start calculating the great circle distance
for idx, row in data_all.iterrows():
    row['gc_distance'] = gcd.dist(row['ping_location'], row['destination'])
    print(row)
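So basically, I create an empty column called gc_distance, then iterate over every row to compute the distance. When I print each row inside the loop, the data looks great.
Sample output for one row: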
created_at_des 2018-01-17 18:55:55.154000
location_missing 0
ping_location (-121.9419444444, 37.4897222222)
destination (-122.15057, 37.39465)
gc_distance 23.85 km
Name: 393529, dtype: object
As you can see, gc_distance DOES have a value.
Here is sample output from the print statement after the loop:
location_missing ping_location \
0 (-152.859052, 51.218273)
0 (120.585289, 31.298974)
0 (120.585289, 31.298974)
0 (120.585289, 31.298974)
0 (121.4737021, 31.2303904)
destination gc_distance
0 (-122.057005, 37.606922)
1 (-122.057005, 37.606922)
2 (-122.057005, 37.606922)
3 (-122.057005, 37.606922)
4 (-122.057005, 37.606922)
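However, when I print again outside the for loop, the gc_distance column contains only blank values! :(
Why is that??? There are no compile-time or runtime errors... all the other output looks fine, so why is this computed field missing, even though it clearly has a value when I print it inside the for loop? (Outside the loop it is gone.)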
Answer 0 (score: 1)
Try this approach:
import pandas as pd
import numpy as np
import math

def dist(i):
    # Euclidean distance between the tuples in columns 'a' and 'b' of row i
    diff = list(map(lambda a, b: a - b, df['a'][i], df['b'][i]))
    squared = [(k)**2 for k in diff]
    squared_diff = sum(squared)
    root = math.sqrt(squared_diff)
    return root

df = pd.DataFrame([[0, 0, 5, 6, '', '', ''], [2, 6, -5, 8, '', '', '']], columns = ["x_a", "y_a", "x_b", "y_b", "a", "b", "dist"])
print(df)

#data_all['ping_location'] = list(zip(data_all.longitude_evnt, data_all.lattitude_evnt))
# Build the coordinate tuples from the separate x/y columns
df['a'] = list(zip(df.x_a, df.y_a))
df['b'] = list(zip(df.x_b, df.y_b))
print(df)

# Index the DataFrame itself instead of assigning to the row from iterrows()
for i in range(0, len(df)):
    df['dist'][i] = dist(i)
    print(dist(i))

print(df)
Here is my terminal output:
x_a y_a x_b y_b a b dist
0 0 0 5 6
1 2 6 -5 8
x_a y_a x_b y_b a b dist
0 0 0 5 6 (0, 0) (5, 6)
1 2 6 -5 8 (2, 6) (-5, 8)
test.py:24: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df['dist'][i] = dist(i)
7.810249675906654
7.280109889280518
x_a y_a x_b y_b a b dist
0 0 0 5 6 (0, 0) (5, 6) 7.81025
1 2 6 -5 8 (2, 6) (-5, 8) 7.28011
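Both loops above run into pandas' copy semantics. In the question, `iterrows()` hands back each row as a separate Series, and for a frame with mixed dtypes that Series is a copy, so assigning to `row[...]` never reaches `data_all`. In this answer, `df['dist'][i] = ...` is chained assignment, which is what triggers the SettingWithCopyWarning in the output. Below is a minimal sketch of two alternatives that write to the frame directly; `euclidean` and the column names `dist_loop` and `dist_zip` are hypothetical, chosen only for illustration, and the real distance function (for example `gcd.dist`) would be swapped in.

```python
import math

import pandas as pd


def euclidean(a, b):
    # Hypothetical stand-in for the real distance function (e.g. gcd.dist).
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


df = pd.DataFrame({"x_a": [0, 2], "y_a": [0, 6], "x_b": [5, -5], "y_b": [6, 8]})
df["a"] = list(zip(df.x_a, df.y_a))
df["b"] = list(zip(df.x_b, df.y_b))

# Option 1: keep an explicit loop, but write into the frame by label
# with df.at instead of assigning to a row copy.
df["dist_loop"] = float("nan")
for idx in df.index:
    df.at[idx, "dist_loop"] = euclidean(df.at[idx, "a"], df.at[idx, "b"])

# Option 2: build the whole column at once from the two tuple columns.
df["dist_zip"] = [euclidean(a, b) for a, b in zip(df["a"], df["b"])]

print(df[["a", "b", "dist_loop", "dist_zip"]])
```

Either option assigns through the DataFrame itself, so the values are still there after the loop and no SettingWithCopyWarning should be raised. The second form is usually the shorter one when the function only needs the two tuple columns.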