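I don't fully understand how the triangle inequality can be used to optimize distance calculations in KNN classification.
I wrote a Python script following the steps mentioned below.
Python script: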
def get_distance(p1=(0, 0), p2=(0, 0)):
    # Manhattan (L1) distance on the first two coordinates; the label is ignored.
    return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])


def algorithm(train_set, new_point):
    # d_n: distance to the current nearest point; d_p: most recently computed distance.
    d_n = get_distance(new_point, train_set[0])
    d_p = get_distance(new_point, train_set[1])
    min_index = 0
    if d_p < d_n:
        d_n = d_p
        min_index = 1
    for c in range(2, len(train_set)):
        # Distance from the current nearest point to candidate c.
        dcp = get_distance(train_set[min_index], train_set[c])
        if d_p - d_n < dcp < d_p + d_n:
            d_p = get_distance(new_point, train_set[c])
            if d_p < d_n:
                d_n = d_p
                min_index = c
    print(train_set[min_index], d_n)


train_set = [
    (0, 1, 'A'),
    (1, 1, 'A'),
    (2, 5, 'B'),
    (1, 8, 'A'),
    (5, 3, 'C'),
    (4, 2, 'C'),
    (3, 2, 'A'),
    (1, 7, 'B'),
    (4, 8, 'B'),
    (4, 0, 'A'),
]

for new_point in train_set:
    # Check each training point against the training set itself:
    # the minimum distance should be 0, which is used for validation.
    result_point = min(train_set, key=lambda x: get_distance(x, new_point))
    print(result_point, get_distance(result_point, new_point))
    algorithm(train_set, new_point)
    print('----------')
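However, it does not give the expected result for one of the points.
Is my understanding of the optimization wrong?
Thanks in advance for your help.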
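
For comparison, here is a minimal sketch of one common way to state the pruning rule. It is not necessarily the procedure from the steps referenced above. It relies only on the triangle inequality d(query, c) >= d(best, c) - d(query, best): if d(best, c) >= 2 * d(query, best), candidate c cannot be strictly closer than the current best, so its distance to the query need not be computed. The names `manhattan` and `nearest_with_pruning` are hypothetical, and the distances between training points are recomputed here for brevity, whereas a real implementation would presumably precompute them.

```python
def manhattan(p, q):
    """L1 distance on the first two coordinates (a trailing label is ignored)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def nearest_with_pruning(points, query):
    """Return (nearest point, distance), skipping candidates ruled out by the bound."""
    best_index = 0
    best_dist = manhattan(query, points[0])
    for c in range(1, len(points)):
        # Distance between two training points; assumed cheap or precomputed.
        d_best_c = manhattan(points[best_index], points[c])
        # Triangle inequality: d(query, c) >= d(best, c) - d(query, best).
        # If d(best, c) >= 2 * best_dist, then d(query, c) >= best_dist,
        # so c cannot improve on the current best and is skipped.
        if d_best_c >= 2 * best_dist:
            continue
        d = manhattan(query, points[c])
        if d < best_dist:
            best_index, best_dist = c, d
    return points[best_index], best_dist


# Same sample data as in the script above.
sample_points = [
    (0, 1, 'A'), (1, 1, 'A'), (2, 5, 'B'), (1, 8, 'A'), (5, 3, 'C'),
    (4, 2, 'C'), (3, 2, 'A'), (1, 7, 'B'), (4, 8, 'B'), (4, 0, 'A'),
]

for point in sample_points:
    print(nearest_with_pruning(sample_points, point))
```

The bound in this sketch always uses the distance to the current nearest point, so a skipped candidate is guaranteed not to be strictly closer. On ties, it keeps the first nearest point it found.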