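I am trying to convert k-nearest-neighbours code from R to Python. I am having trouble with the last few lines, where the classification is assigned. I understand the algorithm at a high level but am unsure about some of the details. This is what I have so far: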
from pandas import Series, DataFrame
import pandas as pd
import numpy as np
import numpy.random as npr
from scipy.spatial.distance import pdist, squareform
# define function
def kNN(X, y, k):
    # number of observations
    n = len(y)
    # set up return values
    y_hat = Series(np.zeros(n), dtype='Int64')
    # pairwise distance matrix
    dist = squareform(pdist(X, metric='euclidean'))
    # so it can't choose itself as a closest neighbour
    np.fill_diagonal(dist, 1e10)
    # loop through each observation
    for i in range(n):
        # find the y values of the k nearest neighbours
        y_nearest = y[dist[i, :].argsort()[:k]]
        # assign y_hat
        if y_nearest.mean() > 0.5:
            y_hat[i] = 1
    # return the predicted classification
    return y_hat
When I test it, I get this error:
ValueError: No axis named 1 for object type <class 'pandas.core.series.Series'>
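
For comparison, the voting step could be sketched as below. This is a minimal sketch under stated assumptions, not a confirmed fix: it assumes binary 0/1 labels, converts the inputs to NumPy arrays so that all indexing is positional, and uses a hypothetical function name (knn_predict) and made-up toy data. As for the error itself, that message is what pandas raises when a Series method is called with axis=1, so it may originate in the test call rather than in the function body; since the test code is not shown, that is only a guess.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def knn_predict(X, y, k):
    """Leave-one-out k-nearest-neighbour vote for binary 0/1 labels.

    knn_predict is a hypothetical name used only for this sketch.
    """
    # Work on plain NumPy arrays so all indexing below is positional,
    # however a pandas Series or DataFrame input happens to be indexed.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(y)

    # Full n x n matrix of pairwise Euclidean distances.
    dist = squareform(pdist(X, metric="euclidean"))
    # A point must not be chosen as its own neighbour.
    np.fill_diagonal(dist, np.inf)

    y_hat = np.zeros(n, dtype=int)
    for i in range(n):
        # Positions of the k smallest distances in row i.
        nearest = np.argsort(dist[i])[:k]
        # Majority vote: predict 1 when more than half the neighbours are 1.
        if y[nearest].mean() > 0.5:
            y_hat[i] = 1
    return y_hat


# Toy data (made up): two well-separated clusters of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, k=2))  # [0 0 0 1 1 1]
```

Converting to arrays up front is a design choice for the sketch: indexing a Series with an integer array can be interpreted by label rather than by position, which is easy to get wrong when the index is not 0..n-1.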