I am trying to write a loop that can iterate over both NumPy arrays and floats, specifically ndarray and float64.
My current code is:
import math

def euclidean_distance(a, b):
    # Debug output: show what types are actually being passed in
    print(type(a))
    print(type(b))
    total_distance = 0
    # Sum the squared differences element by element
    for index in range(len(a)):
        total_distance = total_distance + ((a[index] - b[index]) * (a[index] - b[index]))
    total_distance = math.sqrt(total_distance)
    return total_distance
My output is:
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.float64'>
<class 'numpy.float64'>
Traceback (most recent call last):
  File "D:/ML/WiP_KMeans.py", line 289, in <module>
    main()
  File "D:/ML/WiP_KMeans.py", line 286, in main
    k_means(test, 3)
  File "D:/ML/WiP_KMeans.py", line 239, in k_means
    centroid_error = centroid_error + get_centroid_error(currCent , oldCent)
  File "D:/ML/WiP_KMeans.py", line 70, in get_centroid_error
    total_error = total_error + euclidean_distance(centroid[index], old_centroid[index])
  File "D:/ML/WiP_KMeans.py", line 48, in euclidean_distance
    for index in range(len(a)):
TypeError: object of type 'numpy.float64' has no len()
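I have tried different variations of nditer from the NumPy documentation, but I have not found a solution that lets me correctly iterate over either an ndarray or a float to calculate the Euclidean distance.
A typical input would be something like a=[0.3, 5.4, 3.2, 11.0] and b=[0.0, 5.0, 31.3, 2.0].
Here are some of the pairs that get passed in: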
[5.9, 3.0, 5.1, 1.8] - [5.1, 3.3, 1.7, 0.5]
[5.9, 3.0, 5.1, 1.8] - [4.8, 3.4, 1.9, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.0, 3.0, 1.6, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.0, 3.4, 1.6, 0.4]
[5.9, 3.0, 5.1, 1.8] - [5.2, 3.5, 1.5, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.2, 3.4, 1.4, 0.2]
[5.9, 3.0, 5.1, 1.8] - [4.7, 3.2, 1.6, 0.2]
[5.9, 3.0, 5.1, 1.8] - [4.8, 3.1, 1.6, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.4, 3.4, 1.5, 0.4]
[5.9, 3.0, 5.1, 1.8] - [5.2, 4.1, 1.5, 0.1]
[5.9, 3.0, 5.1, 1.8] - [4.9, 3.1, 1.5, 0.1]
[5.9, 3.0, 5.1, 1.8] - [5.0, 3.2, 1.2, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.5, 3.5, 1.3, 0.2]
[5.9, 3.0, 5.1, 1.8] - [4.9, 3.1, 1.5, 0.1]
[5.9, 3.0, 5.1, 1.8] - [4.4, 3.0, 1.3, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.1, 3.4, 1.5, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.0, 3.5, 1.3, 0.3]
[5.9, 3.0, 5.1, 1.8] - [4.5, 2.3, 1.3, 0.3]
[5.9, 3.0, 5.1, 1.8] - [4.4, 3.2, 1.3, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.0, 3.5, 1.6, 0.6]
[5.9, 3.0, 5.1, 1.8] - [5.1, 3.8, 1.9, 0.4]
[5.9, 3.0, 5.1, 1.8] - [4.8, 3.0, 1.4, 0.3]
[5.9, 3.0, 5.1, 1.8] - [5.1, 3.8, 1.6, 0.2]
[5.9, 3.0, 5.1, 1.8] - [4.6, 3.2, 1.4, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.3, 3.7, 1.5, 0.2]
[5.9, 3.0, 5.1, 1.8] - [5.0, 3.3, 1.4, 0.2]
[5.9, 3.0, 5.1, 1.8] - [4.9, 2.4, 3.3, 1.0]
[5.9, 3.0, 5.1, 1.8] - [5.0, 2.0, 3.5, 1.0]
[5.9, 3.0, 5.1, 1.8] - [5.0, 2.3, 3.3, 1.0]
[5.9, 3.0, 5.1, 1.8] - [5.1, 2.5, 3.0, 1.1]
[5.488288288288287] - [6.4]
Can anyone help?
Answer 0 (score: 3)
This operation can be fully vectorized. No Python-level loop is needed, which greatly improves performance:
import numpy as np

a = np.array([0.3, 5.4, 3.2, 11.0])
b = np.array([0.0, 5.0, 31.3, 2.0])
np.sqrt(np.sum((a - b) ** 2))
However, NumPy comes with batteries included. There is a function for this:
np.linalg.norm(a - b)
Both approaches should perform similarly, although the second may be faster.
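Applied to the function from the question, a minimal sketch might look like the following. It assumes `a` and `b` are either equal-length 1-D arrays or both scalars. The call to `np.atleast_1d`, which promotes a scalar such as `numpy.float64` to a one-element array, is an assumption on my part about how you want the scalar case handled; it is not taken from the original code.

```python
import numpy as np

def euclidean_distance(a, b):
    # Assumption: a scalar input should be treated as a 1-element vector.
    # np.atleast_1d leaves arrays with one or more dimensions unchanged.
    a = np.atleast_1d(np.asarray(a, dtype=float))
    b = np.atleast_1d(np.asarray(b, dtype=float))
    # Vectorized distance: no Python-level loop and no len() call
    return float(np.linalg.norm(a - b))

# Array inputs, as in most of the pairs from the question
print(euclidean_distance(np.array([5.9, 3.0, 5.1, 1.8]),
                         np.array([5.1, 3.3, 1.7, 0.5])))

# Scalar inputs, like the final pair that triggered the TypeError
print(euclidean_distance(np.float64(5.488288288288287), np.float64(6.4)))
```

With scalar inputs this reduces to the absolute difference of the two values (about 0.9117 for the last pair above).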
Answer 1 (score: 0)
Here is an example that should work for you.
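import numpy as np

a = np.array([0.3, 5.4, 3.2, 11.0])
b = np.array([0.0, 5.0, 31.3, 2.0])
# Single values wrapped in one-element arrays so they can be indexed too
c = np.array([0.1])
d = np.array([6.2])

def dist(x, y):
    # Sum of squared element-wise differences, then the square root
    return np.sqrt(sum([(x[i] - y[i]) ** 2 for i in range(x.shape[0])]))

print(dist(a, b))
print(dist(c, d))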
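Note that `dist` only works here because `c` and `d` are one-element arrays. In the question, the `numpy.float64` values most likely come from indexing a 1-D array, which returns a scalar rather than an array. The sketch below illustrates that under assumed data: `points` and `means` are hypothetical names and values chosen to mirror the traceback, not the asker's actual variables.

```python
import numpy as np

def dist(x, y):
    return np.sqrt(sum([(x[i] - y[i]) ** 2 for i in range(x.shape[0])]))

# Hypothetical data: a 2-D array of points and a 1-D array of values
points = np.array([[5.9, 3.0, 5.1, 1.8], [5.1, 3.3, 1.7, 0.5]])
means = np.array([5.488288288288287, 6.4])

row = points[0]    # indexing a 2-D array gives a 1-D ndarray
scalar = means[0]  # indexing a 1-D array gives a numpy.float64 scalar

print(type(row).__name__, np.ndim(row))
print(type(scalar).__name__, np.ndim(scalar))

# The scalar has no length, which is the error from the question
try:
    len(scalar)
except TypeError as exc:
    print("len() failed:", exc)

# Wrapping each scalar in a 1-element array makes dist usable again
wrapped = dist(np.atleast_1d(means[0]), np.atleast_1d(means[1]))
print(wrapped)
```

If the scalars are unexpected, it may be worth checking whether the centroid arrays in `get_centroid_error` have the shape you intended before they are indexed.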