scikit-learn grid search: syntax for a custom scoring object

Asked: 2016-11-13 10:16:26

Tags: python scikit-learn keras

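I want to search over the hyperparameters of a convolutional network built with Keras. To do that, I use KerasClassifier together with scikit-learn's GridSearchCV, following the good introduction at MachineLearningMastery. By default scikit-learn optimizes for 'accuracy' here, but my network performs image segmentation and should be optimized for the Jaccard index. I therefore need to define my own scoring object for the grid search with make_scorer, as described here (make_scorer) and here (defining your scoring strategy). The code below shows my implementation, but I get an error at model.compile(optimizer=optimizer, loss=eval_loss, metrics=['eval_func']) and I don't know what to specify for metrics. The default is 'accuracy', and I assumed that in my case it would be 'eval_func' (which works when I am not running a grid search) or 'score', but neither of these works here.

What is the correct syntax?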

def eval_func(y_true, y_pred):
    '''Evaluation function dice or jaccard, set with global var JACCARD=True'''
    if JACCARD:
        return jaccard_index(y_true, y_pred)
    else:
        return dice_coef(y_true, y_pred)


def get_unet(batch_size=32, decay=0, dropout_rate=0.5, weight_constraint=0):
    '''Create u-net model'''
    dim = 32    

    inputs = Input((3, image_cols, image_rows)) # modified to take 3 color channel input
    conv1 = Convolution2D(dim, 3, 3, activation='relu', border_mode='same', W_constraint=weight_constraint)(inputs)
    conv1 = Convolution2D(dim, 3, 3, activation='relu', border_mode='same', W_constraint=weight_constraint)(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    pool1 = Dropout(dropout_rate)(pool1) # dropout added to all layers

    ... more layers ...

    conv10 = Convolution2D(1, 1, 1, activation='sigmoid')(conv9)

    model = Model(input=inputs, output=conv10)

    optimizer = Adam(lr=LR, decay=decay)   
    model.compile(optimizer=optimizer, loss=eval_loss, metrics=['eval_func'])

    return model

def run_grid_search():
    '''Optimize model parameters with grid search'''

    ... loading data ...

    model = KerasClassifier(build_fn=get_unet, verbose=1, nb_epoch=NUM_EPOCH, shuffle=True)
    # define grid search parameters
    batch_size = [16, 32, 48]
    decay = [0, 0.002, 0.004]
    param_grid = dict(batch_size=batch_size, decay=decay)

    # create scoring object
    score = make_scorer(eval_func, greater_is_better=True)

    grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring=score, n_jobs=1, verbose=1)
    grid_result = grid.fit(X_aug, Y_aug) 

Here is the last part of the error I get when using 'eval_func' or 'score':

  File "C:\Program Files\Anaconda2\lib\site-packages\keras\metrics.py", line 216, in get
    return get_from_module(identifier, globals(), 'metric')
  File "C:\Program Files\Anaconda2\lib\site-packages\keras\utils\generic_utils.py", line 16, in get_from_module
    str(identifier))
Exception: Invalid metric: eval_func

1 answer:

Answer 0 (score: 1)

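Remove the quotes when you pass the metric to compile. Keras only recognizes a metric given as a quoted string when that metric is part of the Keras API; a function you define yourself has to be passed as the function object. See keras.io/metrics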
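The lookup rule described above can be pictured with a toy sketch. This is not Keras source code: `BUILT_IN_METRICS`, `resolve_metric`, and the stand-in `eval_func` are hypothetical names used only to illustrate why a string that is not a built-in name fails while the function object is accepted.

```python
# Toy illustration only; this is NOT how Keras is actually implemented.

def accuracy(y_true, y_pred):
    """Fraction of positions where prediction equals target."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / float(len(y_true))

# Hypothetical registry of metrics the "library" knows by name.
BUILT_IN_METRICS = {"accuracy": accuracy}

def resolve_metric(identifier):
    """Return a metric callable from either a name or a callable."""
    if callable(identifier):
        # A user-defined function is accepted as-is.
        return identifier
    if identifier in BUILT_IN_METRICS:
        # A string only works if it names a built-in metric.
        return BUILT_IN_METRICS[identifier]
    raise ValueError("Invalid metric: " + str(identifier))

def eval_func(y_true, y_pred):
    """Stand-in for the user-defined metric from the question."""
    return accuracy(y_true, y_pred)

# The function object resolves; the string 'eval_func' does not.
metric = resolve_metric(eval_func)
```

In this sketch, `resolve_metric("eval_func")` raises an error because the string is looked up in the registry and not in the caller's namespace.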

This is the problem:

model.compile(optimizer=optimizer, loss=eval_loss, metrics=['eval_func'])

You should change it to:

model.compile(optimizer=optimizer, loss=eval_loss, metrics=[eval_func])
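A related point on the scikit-learn side, offered as an assumption since the question does not show how jaccard_index is written: make_scorer calls the scoring function with y_true and the output of the estimator's predict, which are normally NumPy arrays and not Keras tensors. If the Keras metric is written with backend tensor operations, a separate NumPy version may be needed for the scorer. A minimal sketch, where the name `jaccard_index_np`, the 0.5 threshold, and the empty-mask convention are all illustrative choices:

```python
import numpy as np

def jaccard_index_np(y_true, y_pred, threshold=0.5):
    """Jaccard index (intersection over union) of two binary masks.

    Hypothetical NumPy helper; values above `threshold` count as foreground.
    """
    true_mask = np.asarray(y_true) > threshold
    pred_mask = np.asarray(y_pred) > threshold
    union = np.logical_or(true_mask, pred_mask).sum()
    if union == 0:
        # Both masks are empty: treated as perfect agreement (a convention).
        return 1.0
    intersection = np.logical_and(true_mask, pred_mask).sum()
    return float(intersection) / float(union)
```

Such a function could then be wrapped with make_scorer(jaccard_index_np, greater_is_better=True) and passed as scoring to GridSearchCV, while the tensor-based eval_func stays in model.compile.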

Hope that helps!