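My goal is to return the total number of points within a given radius, together with their average distance from the center. In the example below the radius is 2, and the center is given by X2, Y2.
There will always be one point located at exactly the same position as X2, Y2. I want to exclude that point from the analysis.
Note: I'd like the function to handle multiple points in time.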
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
'Time' : [1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3],
'Item' : ['A','B','C','D','E','F','A','B','C','D','E','F','A','B','C','D'],
'x' : [4,5,8,3,6,2,6,4,3.5,2,4,6,6,2,4,4],
'y' : [-2,0,-2,0,0,4,-1,-2,-2,4,-3,2,-2,0,-2.5,4],
'X2' : [4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4],
'Y2' : [-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2,-2],
})
sq_dist = (df['X2'] - df['x']) ** 2 + (df['Y2'] - df['y']) ** 2
# count of points within the radius
count = ((sq_dist <= 2 ** 2).astype(int)
         .groupby([df['Time']])
         .sum()
         .reset_index()
         .fillna(0)
         )
# average distance of the points within the radius
df['dist'] = np.sqrt((df['X2'] - df['x']) ** 2 + (df['Y2'] - df['y']) ** 2)
inside = df[sq_dist <= 2 ** 2].copy()
avg_dist = (inside.groupby(['Time'])['dist']
            .mean()
            .reset_index()
            .fillna(0)
            )
If I merge count and avg_dist, the output should be:
   Time  count  dist
0     1      0  0.0
1     2      2  0.75
2     3      1  0.5
Answer 0 (score: 6)
You can try:
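# distance to the reference point (the X2, Y2 columns of the question's df)
dist = np.square(df[['X2', 'Y2']] - df[['x', 'y']].values).sum(axis=1) ** 0.5
(dist[dist.le(2) & dist.gt(0)]                 # keep points inside the radius, drop the point at the center
 .groupby(df['Time'])                          # group by Time
 .agg(['mean', 'count'])                       # mean distance and count per Time
 .reindex(df.Time.unique(), fill_value=0)      # restore Time values that have no valid points
)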
Output:
      mean  count
Time
1      0.0      0
2      1.0      1
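For reference, the same filter-then-group idea can be wrapped in a function and run against the data from the question. This is only a sketch: the function name `radius_stats` and its `radius` and `inclusive` parameters are made up for illustration, and it uses `np.hypot` in place of the squared-sum expression.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Time': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3],
    'Item': ['A', 'B', 'C', 'D', 'E', 'F', 'A', 'B', 'C', 'D', 'E', 'F', 'A', 'B', 'C', 'D'],
    'x': [4, 5, 8, 3, 6, 2, 6, 4, 3.5, 2, 4, 6, 6, 2, 4, 4],
    'y': [-2, 0, -2, 0, 0, 4, -1, -2, -2, 4, -3, 2, -2, 0, -2.5, 4],
    'X2': [4] * 16,
    'Y2': [-2] * 16,
})


def radius_stats(df, radius=2.0, inclusive=True):
    """Hypothetical helper: per-Time count and mean distance of the points
    within `radius` of (X2, Y2), ignoring any point that sits on the center."""
    # Euclidean distance from every point to its reference point
    dist = np.hypot(df['X2'] - df['x'], df['Y2'] - df['y'])
    # inclusive=True keeps points exactly on the boundary (dist == radius)
    inside = dist.le(radius) if inclusive else dist.lt(radius)
    # dist > 0 drops the point located at the center itself
    valid = dist[inside & dist.gt(0)]
    out = (valid.groupby(df['Time'])
                .agg(['count', 'mean'])
                # bring back Time values with no valid points, filled with 0
                .reindex(df['Time'].unique(), fill_value=0))
    return out.rename_axis('Time').reset_index()


result = radius_stats(df)                          # boundary included (<= 2)
result_strict = radius_stats(df, inclusive=False)  # boundary excluded (< 2)
print(result)
print(result_strict)
```

With the boundary included, Time 3 counts two points (distances 2.0 and 0.5, mean 1.25), because item A lies exactly on the radius. With `inclusive=False` the result is count 0, 2, 1 and mean 0.0, 0.75, 0.5, which matches the output requested in the question.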