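I am new to Python and need some help with my code. I am reading an ARFF file in my Jupyter notebook, which uses Python 2.7. I would like to know which argument I need to pass to arff.loadarff, or whether there is another approach, so that the data header is ignored.
train, meta = arff.loadarff(open('train.arff', 'r'))
After reading the file I do some math on the data, but I get the error below.
I hope someone can help me figure this out.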
train,meta = arff.loadarff(open('train.arff', 'r'))
train = pd.DataFrame(train)
print(train)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-192-3b2868d1fd43> in <module>()
----> 1 ne = getNeighbors(X_train, y_train, X_test, k = 3)
2 print(ne)
<ipython-input-191-75b4da86d04e> in getNeighbors(X_train, y_train, X_test, k)
6 for (trainpoint,y_train_label) in zip(X_train,y_train):
7 # calculate the distance and append it to a distances_label with the associated label.
----> 8 distances_label.append((distance(testpoint, trainpoint), y_train_label))
9 k_neighbors_with_labels += [sorted(distances_label)[0:k]] # sort the distances and taken the first k neighbors
10 return k_neighbors_with_labels
<ipython-input-186-22e861402349> in distance(testpoint, trainpoint)
2 def distance(testpoint, trainpoint):
3 # distance between testpoint and trainpoint.
----> 4 dist = np.sqrt(np.sum(np.power(float(testpoint)-float(trainpoint), 2)))
5 return dis
6
ValueError: could not convert string to float: sepal_length
Answer 0 (score: 0)
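Your distance function assumes that testpoint is an array, but that is not the case here. You are working with pandas DataFrames, which are not plain arrays. Iterating over a DataFrame (as zip(X_train, y_train) does if X_train is a DataFrame) yields the column labels rather than the rows, which is why float() receives the column name 'sepal_length' and fails.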
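
One possible way around this is to convert the DataFrame to plain NumPy arrays before computing distances. This is only a sketch: the small DataFrame stands in for the contents of train.arff, and the column names sepal_width and label are assumptions (only sepal_length appears in the error message).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the DataFrame built from train.arff;
# "sepal_width" and "label" are assumed column names.
train = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.3],
    "sepal_width": [3.5, 3.0, 3.3],
    "label": ["setosa", "setosa", "virginica"],
})

# Iterating over a DataFrame yields its column labels, not its rows.
columns_seen = list(train)

# Convert the feature columns to a float array and the labels to a 1-D array.
X_train = train[["sepal_length", "sepal_width"]].values.astype(float)
y_train = train["label"].values


def distance(testpoint, trainpoint):
    """Euclidean distance between two points given as numeric sequences."""
    diff = np.asarray(testpoint, dtype=float) - np.asarray(trainpoint, dtype=float)
    return np.sqrt(np.sum(np.power(diff, 2)))


# Iterating over the NumPy array now gives one row (one point) at a time.
d = distance(X_train[0], X_train[1])
```

With arrays in place, zip(X_train, y_train) pairs each row with its label, so a loop like the one in getNeighbors receives numeric points instead of column names.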