How to ensure that at least 2 of n classes are included in the training data

Asked: 2018-12-13 01:28:25

Tags: python python-3.x tensorflow keras

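I am currently training a CNN, and one of the metrics I use is AUC. One problem I have noticed is that my generator sometimes selects examples from only one class (I have 3 classes in this project). So with a batch size of 20, it will occasionally draw all 20 examples from, say, class 1. When that happens I get an error saying that AUC cannot be computed with only one class, and training stops.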

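Is there a way to build a condition into the generator that requires each batch to contain at least 2 of the n classes? Preferably without using tf.metrics.auc.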

Thanks

# load training data
import os

import numpy as np
import scipy.io

# dir_dict is defined elsewhere in the script (maps names to directories)
def load_train_data_batch_generator(batch_size=32, rows_in=48, cols_in=48, zs_in=32,
                                    channels_in=2, num_classes=3,
                                    dir_dict=dir_dict):

    # dir_in_train = main_dir + '/test_CT_PET_combo'

    # required when using hyperopt
    batch_size = int(batch_size)
    # if not: TypeError: 'float' object cannot be interpreted as an integer

    fnames = os.listdir(dir_dict['dir_in_train_combo'])

    while True:
        # allocate fresh arrays for every batch so that a batch still waiting
        # in the Keras queue is not overwritten by the next one
        y_train = np.zeros((batch_size, num_classes))
        x_train = np.zeros((batch_size, rows_in, cols_in, zs_in, channels_in))

        # purely random draw: nothing here stops all files from sharing one class
        for count, fname in enumerate(np.random.choice(fnames, batch_size, replace=False)):

            # one-hot label stored under the 'output' key
            data_label = scipy.io.loadmat(os.path.join(dir_dict['dir_out_train'], fname))['output']
            y_train[count,:] = data_label

            # Loading train ct w/ c and pet/ct combo
            train_combo = scipy.io.loadmat(os.path.join(dir_dict['dir_in_train_combo'], fname))[fname]
            x_train[count,:,:,:,:] = train_combo

        yield(x_train, y_train)

As requested: the metric code and the error output

Metric code:

def sk_auroc(y_true, y_pred):
    import tensorflow as tf
    from sklearn.metrics import roc_auc_score
    return tf.py_func(roc_auc_score, (y_true, y_pred), tf.double)


Epoch 1/200
 57/205 [=======>......................] - ETA: 11s - loss: 1.2858 - acc: 0.3632 - sk_auroc: 0.4581 - auc: 0.5380ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.
Traceback (most recent call last):

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 158, in __call__
    ret = func(*args)

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 277, in roc_auc_score
    sample_weight=sample_weight)

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/sklearn/metrics/base.py", line 118, in _average_binary_score
    sample_weight=score_weight)

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 268, in _binary_roc_auc_score
    raise ValueError("Only one class present in y_true. ROC AUC score "

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.


     [[Node: metrics_1/sk_auroc/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT], Tout=[DT_DOUBLE], token="pyfunc_24", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_predictions_target_1_0_1, predictions_1/Softmax/_857)]]
Traceback (most recent call last):
  File "<ipython-input-48-34101247f335>", line 8, in optimize_cnn
    model, results = train_model(space)
  File "<ipython-input-47-254bd056a344>", line 40, in train_model
    validation_steps=round(len(os.listdir(dir_out_val))/space['batch_size'])
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1454, in __call__
    self._session._session, self._handle, args, status, None)
  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.
Traceback (most recent call last):

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 158, in __call__
    ret = func(*args)

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 277, in roc_auc_score
    sample_weight=sample_weight)

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/sklearn/metrics/base.py", line 118, in _average_binary_score
    sample_weight=score_weight)

  File "/home/mikedoho/anaconda3/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 268, in _binary_roc_auc_score
    raise ValueError("Only one class present in y_true. ROC AUC score "

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.


     [[Node: metrics_1/sk_auroc/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT], Tout=[DT_DOUBLE], token="pyfunc_24", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_predictions_target_1_0_1, predictions_1/Softmax/_857)]]

The tf.metrics.auc code, and a picture showing why I don't like it:

# converting tf metric in keras metric
def as_keras_metric(method):
    import functools
    from keras import backend as K
    import tensorflow as tf
    @functools.wraps(method)
    def wrapper(self, args, **kwargs):
        """ Wrapper for turning tensorflow metrics into keras metrics """
        value, update_op = method(self, args, **kwargs)
        K.get_session().run(tf.local_variables_initializer())
        with tf.control_dependencies([update_op]):
            value = tf.identity(value)
        return value
    return wrapper

tf_auc_roc = as_keras_metric(tf.metrics.auc)

tf.metrics.auc looks too smooth and may be hiding some of the drops; I will need to look into this later.

[Image: looking at different metrics for accuracy]
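
A likely reason for the smoothness, offered as a sketch rather than a diagnosis: tf.metrics.auc is a streaming metric. It keeps its counts in local variables, and every run of update_op adds the current batch to them. The wrapper above initializes those variables once, when the metric is created, so the reported value covers every batch seen so far, whereas sk_auroc scores each batch on its own. The plain-Python illustration below uses synthetic data and a hypothetical helper, binary_auc (an exact pairwise AUC, not TensorFlow's thresholded approximation), to compare a per-batch score with a running score.

```python
import random


def binary_auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC for 0/1 labels.

    Returns None when only one class is present, because the score is
    undefined in that case. Hypothetical helper for illustration only.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
seen_labels, seen_scores = [], []
per_batch, running = [], []

for _ in range(50):
    # Synthetic batch of 20: positives score slightly higher on average.
    labels = [rng.randint(0, 1) for _ in range(20)]
    scores = [rng.random() + 0.3 * l for l in labels]

    seen_labels += labels
    seen_scores += scores

    per_batch.append(binary_auc(labels, scores))          # this batch only
    running.append(binary_auc(seen_labels, seen_scores))  # all batches so far
```

Plotting per_batch against running should show the same contrast as in the picture: the running value settles as more batches accumulate, while the per-batch value keeps jumping.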

1 Answer:

Answer 0 (score: 1)

You can use tf.metrics.auc from TensorFlow instead of sklearn.metrics.roc_auc_score from scikit-learn. For example:

import tensorflow as tf
label = tf.Variable([1,0,0,0,1])
pred = tf.Variable([0.8,1,0.6,0.23,0.78])
auc,op = tf.metrics.auc(label,pred)

with tf.Session() as sess:
    init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    sess.run(init)
    for i in range(3):
        auc_value, op_value = sess.run([auc,op])
        print(auc_value)
Output:
0.0
0.6666667
0.66666657

With this metric you will not run into the error.
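
To address the generator side of the question directly, here is a minimal sketch, assuming the class of every file can be looked up before the file is loaded. The function sample_mixed_batch and the labels dict are hypothetical names, not part of the original code. The function redraws a random batch until enough distinct classes are present (rejection sampling) and uses the standard library's random.sample in place of np.random.choice(..., replace=False). One caveat: the traceback passes through _average_binary_score, which suggests the score is computed per one-hot column. If so, the error may also appear when any single class is missing from a batch, and min_classes=num_classes would be the safer setting.

```python
import random


def sample_mixed_batch(fnames, labels, batch_size, min_classes=2,
                       rng=None, max_tries=1000):
    """Draw batch_size distinct file names covering at least min_classes classes.

    fnames : list of file names to sample from.
    labels : dict mapping file name -> integer class (hypothetical lookup
             table, built once up front, e.g. by reading each label file).
    """
    rng = rng or random.Random()

    # Fail early if the request can never be satisfied.
    if len({labels[f] for f in fnames}) < min_classes:
        raise ValueError("data set contains fewer than min_classes classes")
    if batch_size < min_classes:
        raise ValueError("batch_size must be at least min_classes")

    for _ in range(max_tries):
        # Same draw as before: batch_size names without replacement.
        batch = rng.sample(fnames, batch_size)
        # Accept the draw only if it covers enough distinct classes.
        if len({labels[f] for f in batch}) >= min_classes:
            return batch

    raise RuntimeError("could not draw a mixed batch; check the class balance")
```

Inside the generator, the loop would then iterate over sample_mixed_batch(fnames, labels, batch_size) instead of np.random.choice(fnames, batch_size, replace=False). Because rejected draws are simply repeated, the accepted batches still follow the original random sampling, restricted to batches that contain enough classes.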