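I wrote two versions of a K-nearest-neighbors model. The only difference between them is that the second version splits the data into training and test sets, whereas the first uses all of the data (no train/test split). Both use part of the same dataset, shown below.
Feature data (X)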
[[33.499313000000001, 33.499313000000001], [43.238892999999997, 43.238892999999997], [43.252267000000003, 43.252267000000003], [43.251044999999998, 43.251044999999998], [43.2408748, 43.2408748], [42.9685074, 42.9685074], [43.030356099999999, 43.030356099999999], [43.014093000000003, 43.014093000000003], [43.017124000000003, 43.017124000000003], [43.017701000000002, 43.017701000000002], [43.015931000000002, 43.015931000000002], [43.013155699999999, 43.013155699999999], [43.014164000000001, 43.014164000000001], [43.017938700000002, 43.017938700000002], [43.093265000000002, 43.093265000000002], [43.090642000000003, 43.090642000000003], [43.0910607, 43.0910607], [43.110157100000002, 43.110157100000002], [43.077415000000002, 43.077415000000002], [43.096271000000002, 43.096271000000002], [43.103071900000003, 43.103071900000003], [43.100384099999999, 43.100384099999999], [43.0954975, 43.0954975], [43.092902899999999, 43.092902899999999], [43.091816000000001, 43.091816000000001], [43.096359, 43.096359], [43.107227000000002, 43.107227000000002], [43.101459800000001, 43.101459800000001], [43.075345267735997, 43.075345267735997], [43.103663300000001, 43.103663300000001], [43.100808000000001, 43.100808000000001], [43.090563099999997, 43.090563099999997], [43.090455900000002, 43.090455900000002], [43.095485500000002, 43.095485500000002], [43.103427000000003, 43.103427000000003], [43.090653000000003, 43.090653000000003], [43.082611, 43.082611], [43.0901268, 43.0901268], [43.095695999999997, 43.095695999999997], [43.095552599999998, 43.095552599999998], [43.087887000000002, 43.087887000000002], [43.108379900000003, 43.108379900000003], [43.106097200000001, 43.106097200000001], [43.092882000000003, 43.092882000000003], [43.095547199999999, 43.095547199999999], [43.099933499999999, 43.099933499999999], [43.092684599999998, 43.092684599999998], [43.107769300000001, 43.107769300000001], [43.096947399999998, 43.096947399999998], [43.094959000000003, 43.094959000000003], 
[43.104534999999998, 43.104534999999998], [43.099418399999998, 43.099418399999998], [43.095357, 43.095357], [43.097688300000002, 43.097688300000002], [43.057022699999997, 43.057022699999997], [43.092902899999999, 43.092902899999999], [43.095723999999997, 43.095723999999997], [43.075383000000002, 43.075383000000002], [43.057089900000001, 43.057089900000001], [43.084459600000002, 43.084459600000002]]
Response data (y)
[3.5, 4.0, 4.0, 4.5, 4.0, 1.5, 2.0, 2.5, 4.5, 3.5, 5.0, 1.5, 3.5, 3.5, 4.0, 3.0, 3.0, 3.5, 4.5, 4.5, 5.0, 3.0, 3.0, 4.0, 3.5, 4.0, 3.5, 3.5, 3.0, 3.0, 5.0, 3.0, 3.0, 2.5, 3.5, 4.0, 5.0, 3.5, 2.5, 4.0, 2.5, 3.5, 3.5, 4.0, 1.5, 4.0, 4.5, 5.0, 4.0, 3.5, 2.0, 5.0, 5.0, 4.0, 3.5, 3.5, 4.0, 3.0, 3.0, 3.0]
Importing modules
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.neighbors import KNeighborsRegressor
from sklearn.cross_validation import KFold
from sklearn.cross_validation import train_test_split
First model - no split
class LonLatClassifier(BaseEstimator, RegressorMixin):
    def __init__(self):
        pass

    def fit(self, X, y):
        self.knn = KNeighborsRegressor(n_neighbors = 9)
        self.knn.fit(X, y)
        return self

    def predict(self, X):
        return self.knn.predict(X)
When I test this model with the following code, it gives a prediction score.
C = LonLatClassifier()
print(C.score(X, y))
Second model - with split
class LonLatClassifier(BaseEstimator, RegressorMixin):
    def __init__(self, X_train, X_test, y_train, y_test):
        self.xtrain = X_train
        self.ytrain = y_train
        self.xtest = X_test
        self.ytest = y_test
        pass

    def fit(self, X_train, y_train):
        self.knn = KNeighborsRegressor(n_neighbors = 13)
        self.fit = self.knn.fit(self.xtrain, self.ytrain)
        return self

    def predict(self, X_test):
        self.xtest = X_test
        return self.knn.predict(self.xtest)
With this model, when I run the following code, I get neither predictions nor a score.
C = LonLatClassifier(X_train, X_test, y_train, y_test)
C.predict(X_test)
The error message is:
AttributeError                            Traceback (most recent call last)
<ipython-input-52-ecf778e90264> in <module>()
----> 1 C.predict(X_test)

<ipython-input-47-aeee765d2615> in predict(self, X_test)
     14     def predict(self, X_test):
     15         self.xtest = X_test
---> 16         return self.knn.predict(self.xtest)
     17

AttributeError: 'LonLatClassifier' object has no attribute 'knn'
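I can't see where it goes wrong, because the two models differ only in how the data is split. Can anyone help me find the problem and suggest how to fix it? Thank you very much.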
Answer 0 (score: 0)
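Sorry for the confusion. The error occurred simply because I did not call C.fit() before calling C.predict() in the second model. The knn attribute is only created inside fit(), so it does not exist until fit() has run.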
I'm not sure whether this information is useful to the wider Stack Overflow audience. Please let me know and I can take it down. Thanks.
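For anyone who hits the same error, here is a minimal sketch of the fit-then-predict order, not a definitive implementation. It assumes a recent scikit-learn, where train_test_split is imported from sklearn.model_selection rather than the sklearn.cross_validation module used in the question. The n_neighbors constructor argument, the knn_ attribute name, and the randomly generated stand-in data are all my own choices for illustration, not part of the original code.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


class LonLatClassifier(RegressorMixin, BaseEstimator):
    def __init__(self, n_neighbors=13):
        # Assumption: only hyperparameters go in __init__;
        # the data itself is passed to fit() and predict().
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # The fitted model is stored under its own name (knn_),
        # so it does not overwrite the fit method on the instance.
        self.knn_ = KNeighborsRegressor(n_neighbors=self.n_neighbors)
        self.knn_.fit(X, y)
        return self

    def predict(self, X):
        return self.knn_.predict(X)


# Made-up stand-in data with the same shape as X and y in the question.
rng = np.random.default_rng(0)
X = rng.uniform(43.0, 43.2, size=(60, 2))
y = rng.choice(np.arange(1.5, 5.5, 0.5), size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

C = LonLatClassifier()
C.fit(X_train, y_train)           # must run before predict() or score()
predictions = C.predict(X_test)   # one prediction per test row
print(predictions.shape)
print(C.score(X_test, y_test))    # R^2 score inherited from RegressorMixin
```

In the second model from the question, the line self.fit = self.knn.fit(...) also replaces the fit method on that instance with the fitted regressor, so a second call to C.fit() would fail. Storing the model in a separate attribute, as in this sketch, avoids that.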