Machine learning problem (fitting a model)

Asked: 2018-03-23 18:40:27

Tags: python machine-learning scikit-learn

I am using anaconda-navigator -> Python 3.6.

When I run the following code, I get this error:

from sklearn.metrics import make_scorer
from sklearn.tree import DecisionTreeRegressor
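from sklearn.grid_search import GridSearchCV
# ShuffleSplit with the (n, n_iter=...) signature used below lives in the old cross_validation module
from sklearn.cross_validation import ShuffleSplit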

def fit_model(X, y):
    """ Performs grid search over the 'max_depth' parameter for a 
        decision tree regressor trained on the input data [X, y]. """

    # Create cross-validation sets from the training data
    # sklearn version 0.18: ShuffleSplit(n_splits=10, test_size=0.1, train_size=None, random_state=None)
    # sklearn version 0.17: ShuffleSplit(n, n_iter=10, test_size=0.1, train_size=None, random_state=None)
    cv_sets = ShuffleSplit(X.shape[0], n_iter = 10, test_size = 0.20, random_state = 0)

    # TODO: Create a decision tree regressor object
    regressor = DecisionTreeRegressor()

    # TODO: Create a dictionary for the parameter 'max_depth' with a range from 1 to 10
    params = {'max_depth':range(1,10)}

    # TODO: Transform 'performance_metric' into a scoring function using 'make_scorer' 
    scoring_fnc = make_scorer(performance_metric)

    # TODO: Create the grid search cv object --> GridSearchCV()
    # Make sure to include the right parameters in the object:
    # (estimator, param_grid, scoring, cv) which have values 'regressor', 'params', 'scoring_fnc', and 'cv_sets' respectively.

    grid = GridSearchCV(regressor, params, scoring_fnc, cv=cv_sets)

    # Fit the grid search object to the data to compute the optimal model
    grid = grid.fit(X, y)

    # Return the optimal model after fitting the data
    return grid.best_estimator_



# Fit the training data to the model using grid search
reg = fit_model(X_train, y_train)

# Produce the value for 'max_depth'
print ("Parameter 'max_depth' is {} for the optimal model.".format(reg.get_params()['max_depth']))

Here is the error message:

ValueError                                Traceback (most recent call last)
<ipython-input-12-05857a84a7c5> in <module>()
      1 # Fit the training data to the model using grid search
----> 2 reg = fit_model(X_train, y_train)
      3 
      4 # Produce the value for 'max_depth'
      5 print ("Parameter 'max_depth' is {} for the optimal model.".format(reg.get_params()['max_depth']))

<ipython-input-11-2c0c19498236> in fit_model(X, y)
     26     # (estimator, param_grid, scoring, cv) which have values 'regressor', 'params', 'scoring_fnc', and 'cv_sets' respectively.
     27 
---> 28     grid = GridSearchCV(regressor, params, scoring_fnc, cv=cv_sets)
     29 
     30     # Fit the grid search object to the data to compute the optimal model

~/anaconda3/lib/python3.6/site-packages/sklearn/grid_search.py in __init__(self, estimator, param_grid, scoring, fit_params, n_jobs, iid, refit, cv, verbose, pre_dispatch, error_score)
    819             refit, cv, verbose, pre_dispatch, error_score)
    820         self.param_grid = param_grid
--> 821         _check_param_grid(param_grid)
    822 
    823     def fit(self, X, y=None):

~/anaconda3/lib/python3.6/site-packages/sklearn/grid_search.py in _check_param_grid(param_grid)
    349             if True not in check:
    350                 raise ValueError("Parameter values for parameter ({0}) need "
--> 351                                  "to be a sequence.".format(name))
    352 
    353             if len(v) == 0:

ValueError: Parameter values for parameter (max_depth) need to be a sequence.

1 answer:

Answer 0 (score: 0)

grid_search.py performs this check:

check = [isinstance(v, k) for k in (list, tuple, np.ndarray)]

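It seems you cannot use a range here: in Python 3, range(1, 10) returns a range object rather than a list, so it matches none of the types in this check. I would try this: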

import numpy as np
params = {'max_depth': np.arange(1, 10)}

Or without numpy:

params = {'max_depth': list(range(1, 10))}
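
For anyone who wants to see the check in isolation, here is a minimal sketch, assuming Python 3. The names `passes_check` and `ACCEPTED_TYPES` are hypothetical and not part of scikit-learn, and `np.ndarray` is left out of the tuple so the snippet runs without NumPy. It only mirrors the isinstance test quoted above; it is not the library's actual code.

```python
# Simplified stand-in for the check quoted above. np.ndarray is omitted
# so the sketch runs without NumPy (an assumption made for illustration).
ACCEPTED_TYPES = (list, tuple)


def passes_check(values):
    """Return True if `values` is an instance of one of the accepted types."""
    check = [isinstance(values, k) for k in ACCEPTED_TYPES]
    return True in check


candidates = {
    "range(1, 10)": range(1, 10),              # a range object in Python 3
    "list(range(1, 10))": list(range(1, 10)),  # a real list
    "list(range(1, 11))": list(range(1, 11)),  # a real list that includes 10
}

results = {label: passes_check(values) for label, values in candidates.items()}
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")

# range excludes its stop value, so 10 is only tried with range(1, 11).
depths_to_9 = list(range(1, 10))
depths_to_10 = list(range(1, 11))
```

Note that range(1, 10) stops at 9. If the intent is to try depths 1 through 10, as the comment in the question says, range(1, 11) would be needed.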