sklearn - How to properly use KNeighborsRegressor in a class?

Time: 2016-07-17 19:52:37

Tags: scikit-learn knn

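I wrote two versions of a K-nearest-neighbors model. The only difference is that the second version splits the data into training and test sets, while the first uses all of the data (no train/test split). Both use a portion of the same dataset, shown below.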

Feature data (X)

[[33.499313000000001, 33.499313000000001], [43.238892999999997, 43.238892999999997], [43.252267000000003, 43.252267000000003], [43.251044999999998, 43.251044999999998], [43.2408748, 43.2408748], [42.9685074, 42.9685074], [43.030356099999999, 43.030356099999999], [43.014093000000003, 43.014093000000003], [43.017124000000003, 43.017124000000003], [43.017701000000002, 43.017701000000002], [43.015931000000002, 43.015931000000002], [43.013155699999999, 43.013155699999999], [43.014164000000001, 43.014164000000001], [43.017938700000002, 43.017938700000002], [43.093265000000002, 43.093265000000002], [43.090642000000003, 43.090642000000003], [43.0910607, 43.0910607], [43.110157100000002, 43.110157100000002], [43.077415000000002, 43.077415000000002], [43.096271000000002, 43.096271000000002], [43.103071900000003, 43.103071900000003], [43.100384099999999, 43.100384099999999], [43.0954975, 43.0954975], [43.092902899999999, 43.092902899999999], [43.091816000000001, 43.091816000000001], [43.096359, 43.096359], [43.107227000000002, 43.107227000000002], [43.101459800000001, 43.101459800000001], [43.075345267735997, 43.075345267735997], [43.103663300000001, 43.103663300000001], [43.100808000000001, 43.100808000000001], [43.090563099999997, 43.090563099999997], [43.090455900000002, 43.090455900000002], [43.095485500000002, 43.095485500000002], [43.103427000000003, 43.103427000000003], [43.090653000000003, 43.090653000000003], [43.082611, 43.082611], [43.0901268, 43.0901268], [43.095695999999997, 43.095695999999997], [43.095552599999998, 43.095552599999998], [43.087887000000002, 43.087887000000002], [43.108379900000003, 43.108379900000003], [43.106097200000001, 43.106097200000001], [43.092882000000003, 43.092882000000003], [43.095547199999999, 43.095547199999999], [43.099933499999999, 43.099933499999999], [43.092684599999998, 43.092684599999998], [43.107769300000001, 43.107769300000001], [43.096947399999998, 43.096947399999998], [43.094959000000003, 43.094959000000003], 
[43.104534999999998, 43.104534999999998], [43.099418399999998, 43.099418399999998], [43.095357, 43.095357], [43.097688300000002, 43.097688300000002], [43.057022699999997, 43.057022699999997], [43.092902899999999, 43.092902899999999], [43.095723999999997, 43.095723999999997], [43.075383000000002, 43.075383000000002], [43.057089900000001, 43.057089900000001], [43.084459600000002, 43.084459600000002]]

Response data (y)

[3.5, 4.0, 4.0, 4.5, 4.0, 1.5, 2.0, 2.5, 4.5, 3.5, 5.0, 1.5, 3.5, 3.5, 4.0, 3.0, 3.0, 3.5, 4.5, 4.5, 5.0, 3.0, 3.0, 4.0, 3.5, 4.0, 3.5, 3.5, 3.0, 3.0, 5.0, 3.0, 3.0, 2.5, 3.5, 4.0, 5.0, 3.5, 2.5, 4.0, 2.5, 3.5, 3.5, 4.0, 1.5, 4.0, 4.5, 5.0, 4.0, 3.5, 2.0, 5.0, 5.0, 4.0, 3.5, 3.5, 4.0, 3.0, 3.0, 3.0]

Import modules

from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.neighbors import KNeighborsRegressor
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it
from sklearn.model_selection import KFold
from sklearn.model_selection import train_test_split

First model - no split

class LonLatClassifier(BaseEstimator, RegressorMixin):
    def __init__(self):
        pass

    def fit(self, X, y):
        self.knn = KNeighborsRegressor(n_neighbors=9)
        self.knn.fit(X, y)
        return self

    def predict(self, X):
        return self.knn.predict(X)

When I test this model with the following code, it gives a prediction score.

C = LonLatClassifier()
C.fit(X, y)  # fit first: score() calls predict(), which needs self.knn
print(C.score(X, y))
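For context, the score reported here comes from the score() method inherited from RegressorMixin, which calls predict() and returns the coefficient of determination (R²). The sketch below checks that equivalence. It is only an illustration: the class name LonLatRegressorDemo, the n_neighbors=3 setting, and the X_demo / y_demo arrays are made up for this example and are not the data or model from the question.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.metrics import r2_score
from sklearn.neighbors import KNeighborsRegressor


class LonLatRegressorDemo(RegressorMixin, BaseEstimator):
    """Hypothetical stand-in for the first model (name and defaults are assumptions)."""

    def __init__(self, n_neighbors=3):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # The inner KNN model only exists after fit() has run.
        self.knn_ = KNeighborsRegressor(n_neighbors=self.n_neighbors)
        self.knn_.fit(X, y)
        return self

    def predict(self, X):
        return self.knn_.predict(X)


# Synthetic data invented for this sketch only.
rng = np.random.RandomState(0)
X_demo = rng.uniform(43.0, 43.2, size=(30, 2))
y_demo = rng.choice([3.0, 3.5, 4.0, 4.5], size=30)

model = LonLatRegressorDemo().fit(X_demo, y_demo)

# score() from RegressorMixin is R^2 of predict(X) against y.
score_from_mixin = model.score(X_demo, y_demo)
score_by_hand = r2_score(y_demo, model.predict(X_demo))
print(score_from_mixin, score_by_hand)
```

Because score() goes through predict(), it needs a fitted model just as predict() does.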

Second model - with split

class LonLatClassifier(BaseEstimator, RegressorMixin):
    def __init__(self, X_train, X_test, y_train, y_test):
        self.xtrain = X_train
        self.ytrain = y_train
        self.xtest = X_test
        self.ytest = y_test
        pass

    def fit(self, X_train, y_train):
        self.knn = KNeighborsRegressor(n_neighbors=13)
        self.fit = self.knn.fit(self.xtrain, self.ytrain)
        return self

    def predict(self, X_test):
        self.xtest = X_test
        return self.knn.predict(self.xtest)

With this model, when I run the following code, it gives me neither predictions nor a score.

C = LonLatClassifier(X_train, X_test, y_train, y_test)
C.predict(X_test)

The error message is:

AttributeError                            Traceback (most recent call last)
<ipython-input-52-ecf778e90264> in <module>()
----> 1 C.predict(X_test)

<ipython-input-47-aeee765d2615> in predict(self, X_test)
 14     def predict(self, X_test):
 15         self.xtest = X_test
---> 16         return self.knn.predict(self.xtest)
 17 

AttributeError: 'LonLatClassifier' object has no attribute 'knn'

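It's not clear to me where this goes wrong, since the two models differ only in how the data is split. Could someone help me find the problem and suggest how to fix it? Thank you very much.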

1 answer:

Answer 0: (score: 0)

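Sorry for the confusion. The error occurred simply because I did not call C.fit() before calling C.predict() in the second model, so self.knn, which is only created inside fit(), did not exist yet.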

I'm not sure whether this information is useful to the wider Stack Overflow audience. Please let me know and I can take it down. Thank you.
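To make the fix from this answer concrete, here is a minimal sketch of a split-based version that calls fit() before predict(). It is one possible rewrite, not the code from the question, and it rests on several assumptions:

- The class is renamed LonLatRegressor.
- It takes only the hyperparameter n_neighbors in __init__ and receives data through fit(), which is the usual scikit-learn estimator convention.
- The mixin is listed before BaseEstimator, as recent scikit-learn developer documentation recommends.
- The data is synthetic and stands in for the X and y shown in the question.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


class LonLatRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical rewrite of the second model; not the asker's original class."""

    def __init__(self, n_neighbors=13):
        # Only hyperparameters are stored here; data is passed to fit().
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # Create and train the inner model; the attribute exists only after fit().
        self.knn_ = KNeighborsRegressor(n_neighbors=self.n_neighbors)
        self.knn_.fit(X, y)
        return self

    def predict(self, X):
        return self.knn_.predict(X)


# Synthetic stand-in data (an assumption; substitute the real X and y).
rng = np.random.RandomState(0)
X = rng.uniform(43.0, 43.2, size=(60, 2))
y = rng.choice([2.5, 3.0, 3.5, 4.0, 4.5], size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

C = LonLatRegressor()
C.fit(X_train, y_train)          # must come before predict() or score()
predictions = C.predict(X_test)
print(predictions.shape)
print(C.score(X_test, y_test))   # R^2 on the held-out data
```

Unlike the second model in the question, this sketch never assigns to self.fit inside fit(). That assignment would replace the method with the fitted KNeighborsRegressor on the instance, so a later C.fit(...) call would no longer run the class's own fit().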