Question

My dataset looks like this:

ID | X | Y | Z
--------------
 1 | 5 | 5 | 5
 1 | 4 | 2 | 0
 2 | 1 | 3 | 4
 .
 .
 .

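I have ground-truth (x, y, z) values for each ID. For every row in the table above, I want to compute the distance to the ground truth for that row's ID. I tried df.groupby(), but I'm not sure how to stitch the DataFrame back together afterwards.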

Ground truth:

ID | X | Y | Z
---------------
 1 | 1 | 2 | 3
 2 | 4 | 5 | 6
 3 | 7 | 8 | 9
 .
 .

I would like the output to look like this:

ID | X  | Y  | Z
-----------------
 1 |  4 |  3 |  2
 1 |  3 |  0 | -3
 2 | -3 | -2 | -2
 .
 .
 .

Answer 1

You can set ID as the index on both frames and subtract. pandas then aligns the rows on ID (the index, in this case) for you:

df.set_index('ID').sub(ground_truths.set_index('ID')).reset_index()

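Output (ID 3 appears only in ground_truths, so its row is NaN, which also turns the columns into floats):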

   ID    X    Y    Z
0   1  4.0  3.0  2.0
1   1  3.0  0.0 -3.0
2   2 -3.0 -2.0 -2.0
3   3  NaN  NaN  NaN
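
If you would rather keep only the rows of df, with integer values and no NaN row for IDs that exist only in ground_truths, one possible variant is to look up the ground-truth row for each ID and subtract the raw arrays. This is a sketch and not part of the original answer: the names cols, truth, and diff are made up for illustration, and the sample frames are copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"ID": [1, 1, 2], "X": [5, 4, 1], "Y": [5, 2, 3], "Z": [5, 0, 4]})
ground_truths = pd.DataFrame({"ID": [1, 2, 3], "X": [1, 4, 7], "Y": [2, 5, 8], "Z": [3, 6, 9]})

cols = ["X", "Y", "Z"]  # hypothetical helper name for the coordinate columns

# One ground-truth row per row of df, in the same order as df.
# Assumes ID is unique in ground_truths (reindex requires a unique index).
truth = ground_truths.set_index("ID").reindex(df["ID"])[cols]

# Subtract position by position so no index alignment is involved.
diff = df.copy()
diff[cols] = df[cols].to_numpy() - truth.to_numpy()

print(diff)
```

An ID that is present in df but missing from ground_truths would come back from reindex as a NaN row, so the result for that row would be NaN as well.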

Update: for the Euclidean distance:

tmp = df.set_index('ID').sub(ground_truths.set_index('ID'))

# Euclidean part: square the differences, sum per row, take the square root.
# Other packages work too, e.g. np.linalg.norm
result = ((tmp**2).sum(axis=1))**0.5
result = result.reset_index()
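
For reference, here is roughly what the np.linalg.norm version could look like. Treat it as a sketch under a few assumptions: the sample frames are copied from the question, the dropna() step (to discard IDs that exist only in ground_truths) and the column name "distance" are my additions rather than part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"ID": [1, 1, 2], "X": [5, 4, 1], "Y": [5, 2, 3], "Z": [5, 0, 4]})
ground_truths = pd.DataFrame({"ID": [1, 2, 3], "X": [1, 4, 7], "Y": [2, 5, 8], "Z": [3, 6, 9]})

# Per-row differences, aligned on ID as in the answer.
tmp = df.set_index("ID").sub(ground_truths.set_index("ID"))

# Assumption: rows for IDs with no observations (all NaN) are not wanted.
tmp = tmp.dropna()

# Row-wise Euclidean norm of the (X, Y, Z) differences.
result = pd.Series(
    np.linalg.norm(tmp.to_numpy(), axis=1),
    index=tmp.index,
    name="distance",  # hypothetical column name
).reset_index()

print(result)
```

For the sample data the three distances are sqrt(29), sqrt(18), and sqrt(17), matching what the `**2` / `**0.5` version gives for the same rows.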
