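Is there a way to access the predicted values calculated during a GridSearchCV run?
I would like to plot the predicted y values against the actual values (from the test/validation set).
Once the grid search has finished, I can apply it to some other data using
ypred = grid.predict(xv)
but I would like to plot the values calculated during the grid search itself. Maybe there is a way to save those points as a pandas DataFrame?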
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import (GridSearchCV, KFold,
                                     cross_val_score, train_test_split)
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR
scaler = StandardScaler()
svr_rbf = SVR(kernel='rbf')
pipe = Pipeline(steps=[('scaler', scaler), ('svr_rbf', svr_rbf)])
grid = GridSearchCV(pipe, param_grid=parameters, cv=splits, refit=True, verbose=3, scoring=msescorer, n_jobs=4)
grid.fit(xt, yt)
Answer 0 (score: 1)
One solution is to write a custom scorer and save the arguments it receives into a global variable:
import numpy as np
# sklearn.grid_search was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, make_scorer

X, y = np.random.rand(2, 200)
clf = SVR()
ys = []  # global list: one array of predictions per scorer call

def MSE(y_true, y_pred):
    # record this fold's predictions, then return the usual MSE
    ys.append(y_pred)
    return mean_squared_error(y_true, y_pred)

def scorer():
    return make_scorer(MSE, greater_is_better=False)

n_splits = 3
cv = GridSearchCV(clf, {'degree': [1, 2, 3]}, scoring=scorer(), cv=n_splits)
cv.fit(X.reshape(-1, 1), y)
Then we need to gather the predictions from the individual splits into one complete array per parameter setting:
idxs = range(0, len(ys) + 1, n_splits)
# e.g. [0, 3, 6, 9]
# group every n_splits consecutive arrays (one group per parameter setting)
new = [ys[start:stop] for start, stop in zip(idxs, idxs[1:])]
# concatenate each group into a single array
ys = [np.concatenate(group, axis=0) for group in new]
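Since the question asks for a pandas DataFrame, the same idea can be pushed one step further by recording the true values alongside the predictions. What follows is only a rough sketch, not a definitive implementation: the helper name recording_mse, the synthetic data and the parameter grid over C are all made up for illustration. It also assumes the search runs in a single process (n_jobs=1) and that the scorer is called candidate by candidate, fold by fold, which is how the calls are mapped back to a parameter setting here.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import make_scorer, mean_squared_error
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic data, for illustration only.
rng = np.random.RandomState(0)
X = rng.rand(200, 1)
y = rng.rand(200)

records = []  # one (y_true, y_pred) pair per scorer call


def recording_mse(y_true, y_pred):
    # Hypothetical helper: store the fold's values, then return the MSE.
    records.append((np.asarray(y_true), np.asarray(y_pred)))
    return mean_squared_error(y_true, y_pred)


n_splits = 3
search = GridSearchCV(
    SVR(kernel="rbf"),
    {"C": [0.1, 1.0, 10.0]},  # illustrative grid, not from the question
    scoring=make_scorer(recording_mse, greater_is_better=False),
    cv=KFold(n_splits=n_splits),
    n_jobs=1,  # assumption: single process, so `records` is shared
)
search.fit(X, y)

# Assumption: call i belongs to candidate i // n_splits, fold i % n_splits.
frames = []
for i, (y_true, y_pred) in enumerate(records):
    candidate = i // n_splits
    frames.append(pd.DataFrame({
        "candidate": candidate,
        "fold": i % n_splits,
        "C": search.cv_results_["params"][candidate]["C"],
        "y_true": y_true,
        "y_pred": y_pred,
    }))
predictions = pd.concat(frames, ignore_index=True)
```

From there, plotting y_pred against y_true for one parameter setting is a matter of filtering on the candidate column. With n_jobs greater than 1 the scorer would likely run in worker processes, so the list in the parent process could stay empty. If only the best parameters matter, cross_val_predict with the best estimator's parameters is a simpler route to out-of-fold predictions.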