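Using a custom scoring with the multiclass output of a Keras model returns the same error for both cross_val_score and GridSearchCV, as shown below (it uses Iris, so you can run it directly to test):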
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from keras.wrappers.scikit_learn import KerasClassifier
iris = datasets.load_iris()
X= iris.data
Y = to_categorical(iris.target)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=1000)
def create_model(optimizer='rmsprop'):
    model = Sequential()
    model.add(Dense(8, activation='relu', input_shape=(4,)))
    model.add(Dense(3, activation='softmax'))
    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model,
                        epochs=10,
                        batch_size=5,
                        verbose=0)
#results = cross_val_score(model, X_train, Y_train, scoring='precision_macro')
param_grid = {'optimizer': ('rmsprop', 'adam')}
grid = GridSearchCV(model,
                    param_grid=param_grid,
                    return_train_score=True,
                    scoring=['accuracy', 'precision_macro', 'recall_macro'],
                    refit='precision_macro')
grid_results = grid.fit(X_train,Y_train)
Running it gives me the error below.
I have left out the full stack trace, since you can reproduce it by copying the code above.
ValueError: Classification metrics can't handle a mix of multilabel-indicator and binary targets
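When I remove the scoring parameter, it works fine.
Is there a way to avoid this and enable f1, precision, or any custom score? Without rewriting my own grid search code, of course.
Thanks for your help.
Update: I just found a workaround.
First, this doc (http://scikit-learn.org/stable/modules/multiclass.html#multilabel-classification-format) shows that the one-hot representation used in Keras is interpreted as multilabel in scikit-learn.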
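To make that interpretation concrete, here is a minimal sketch that needs only NumPy and scikit-learn (no Keras). The toy labels are made up for illustration; the one-hot array stands in for what to_categorical would produce. It checks how scikit-learn classifies each target format and then triggers the same kind of ValueError by scoring one-hot targets against integer predictions. The second target type named in the message depends on the predictions, so it may differ from the one in the question.

```python
import numpy as np
from sklearn.metrics import precision_score
from sklearn.utils.multiclass import type_of_target

# Toy labels, invented for this illustration.
y_int = np.array([0, 1, 2, 1])           # scikit-learn style multiclass labels
y_onehot = np.eye(3, dtype=int)[y_int]   # the same labels, one-hot encoded

# How scikit-learn interprets each representation.
onehot_type = type_of_target(y_onehot)
int_type = type_of_target(y_int)
print(onehot_type, int_type)

# Mixing the two formats in a metric call raises a ValueError.
try:
    precision_score(y_onehot, y_int, average='macro')
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

This mirrors what happens inside the grid search: the scorer receives the one-hot Y as y_true, while the classifier's predict returns integer class labels.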
Then, looking at scikit_learn.py, which implements the KerasClassifier class:
https://github.com/keras-team/keras/blob/master/keras/wrappers/scikit_learn.py
The fit function of the BaseWrapper class contains the following lines:
if loss_name == 'categorical_crossentropy' and len(y.shape) != 2:
    y = to_categorical(y)
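So the wrapper performs the categorical conversion itself.
It seems that, to get around the difference between its own multiclass representation and scikit-learn's, Keras accepts scikit-learn style multiclass labels such as [0,1,2,1,0,2]
and converts them to the categorical representation only when fitting the NN model.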
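The effect of that check can be sketched without Keras. In the snippet below, to_one_hot is a hypothetical NumPy stand-in for to_categorical and prepare_targets is an illustrative name of my own, not part of the wrapper; only the if condition is taken from the lines quoted above.

```python
import numpy as np

def to_one_hot(y, num_classes=None):
    """Hypothetical NumPy stand-in for keras.utils.to_categorical."""
    y = np.asarray(y, dtype=int)
    if num_classes is None:
        num_classes = int(y.max()) + 1
    return np.eye(num_classes, dtype='float32')[y]

def prepare_targets(y, loss_name='categorical_crossentropy'):
    """Illustrative helper: encode labels only when they are not already 2-D."""
    y = np.asarray(y)
    if loss_name == 'categorical_crossentropy' and len(y.shape) != 2:
        y = to_one_hot(y)
    return y

labels = [0, 1, 2, 1, 0, 2]
encoded = prepare_targets(labels)      # 1-D labels get one-hot encoded
untouched = prepare_targets(encoded)   # 2-D input passes through unchanged
print(encoded.shape, untouched.shape)
```

Because the encoding happens inside fit, scikit-learn only ever sees the integer labels, which is the format its scorers expect.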
So I simply dropped the to_categorical conversion and passed the integer labels to the sklearn functions.
It now works:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils import to_categorical
from keras.wrappers.scikit_learn import KerasClassifier
iris = datasets.load_iris()
X= iris.data
#Y = to_categorical(iris.target,3)
Y = iris.target
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=1000)
def create_model(optimizer='rmsprop'):
    model = Sequential()
    model.add(Dense(8, activation='relu', input_shape=(4,)))
    model.add(Dense(3, activation='softmax'))
    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model,
                        epochs=10,
                        batch_size=5,
                        verbose=0)
#results = cross_val_score(model, X_train, Y_train, scoring='precision_macro')
param_grid = {'optimizer': ('rmsprop', 'adam')}
grid = GridSearchCV(model,
                    param_grid=param_grid,
                    return_train_score=True,
                    scoring=['precision_macro', 'recall_macro', 'f1_macro'],
                    refit='precision_macro')
grid_results = grid.fit(X_train,Y_train)
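Once the fit succeeds, the per-metric scores can be read from cv_results_. The sketch below shows the lookup with the same scoring list, but it swaps in LogisticRegression and a made-up C grid as a stand-in for the Keras model so that it runs without Keras; that substitution is my assumption, not part of the workaround above.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
X, y = iris.data, iris.target   # integer labels, not one-hot

# Stand-in estimator and grid (assumptions for this illustration only).
grid = GridSearchCV(LogisticRegression(max_iter=1000),
                    param_grid={'C': (0.1, 1.0)},
                    return_train_score=True,
                    scoring=['precision_macro', 'recall_macro', 'f1_macro'],
                    refit='precision_macro')
grid.fit(X, y)

# With multi-metric scoring, results are keyed as mean_test_<scorer name>.
print(grid.best_params_)
for metric in ('precision_macro', 'recall_macro', 'f1_macro'):
    print(metric, grid.cv_results_['mean_test_' + metric])
```

The same keys should be available on grid_results from the Keras version, since the naming comes from GridSearchCV rather than from the estimator.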