How to use LearningRateScheduler to choose the best learning rate and optimizer

Time: 2020-11-04 02:21:01

Tags: python tensorflow machine-learning

I learned about LearningRateScheduler from a Coursera course, but copying it the same way leads to poor model performance, perhaps because of the range I set. The documentation on the Keras website is limited.


import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dense


def duo_LSTM_model(X_train, y_train, X_test, y_test, num_classes, batch_size=68,
                   units=128, learning_rate=0.005, epochs=20, dropout=0.2,
                   recurrent_dropout=0.2):

    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Masking(mask_value=0.0, input_shape=(X_train.shape[1], X_train.shape[2])))
    model.add(tf.keras.layers.Bidirectional(LSTM(units, dropout=dropout, recurrent_dropout=recurrent_dropout, return_sequences=True)))
    model.add(tf.keras.layers.Bidirectional(LSTM(units, dropout=dropout, recurrent_dropout=recurrent_dropout)))
    model.add(Dense(num_classes, activation='softmax'))

    # Three candidate optimizers; only adamopt is actually used in compile() below.
    adamopt = tf.keras.optimizers.Adam(lr=learning_rate, beta_1=0.9, beta_2=0.999, epsilon=1e-8)
    RMSopt = tf.keras.optimizers.RMSprop(lr=learning_rate, rho=0.9, epsilon=1e-6)
    SGDopt = tf.keras.optimizers.SGD(lr=learning_rate, momentum=0.9, decay=0.1, nesterov=False)

    # Grow the learning rate by a factor of 10 every 20 epochs, starting at 1e-8.
    lr_schedule = tf.keras.callbacks.LearningRateScheduler(
        lambda epoch: 1e-8 * 10**(epoch / 20))

    model.compile(loss='binary_crossentropy',
                  optimizer=adamopt,
                  metrics=['accuracy'])

    history = model.fit(X_train, y_train,
                        batch_size=batch_size,
                        epochs=epochs,
                        validation_data=(X_test, y_test),
                        verbose=1,
                        callbacks=[lr_schedule])

    score, acc = model.evaluate(X_test, y_test,
                                batch_size=batch_size)

    yhat = model.predict(X_test)

    return history, yhat

I have two questions.

  1. How does 1e-8 * 10**(epoch / 20) work?

  2. How should we choose the range for the 3 different optimizers?

1 Answer:

Answer 0 (score: 1)

Before answering the two questions in your post, let's first clarify that LearningRateScheduler is not for picking the "best" learning rate.

It is an alternative to using a fixed learning rate: instead of keeping the rate constant, you vary it over the course of training.
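For instance, here is a minimal sketch (my own illustration, not code from the question or answer) of the more typical use of LearningRateScheduler: decaying the learning rate as training progresses rather than searching for it:

import tensorflow as tf

def step_decay(epoch, lr):
    # Halve the learning rate every 10 epochs; otherwise leave it unchanged.
    if epoch > 0 and epoch % 10 == 0:
        return lr * 0.5
    return lr

decay_schedule = tf.keras.callbacks.LearningRateScheduler(step_decay)
# Pass it to model.fit(..., callbacks=[decay_schedule]) just like in the question.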

I think what you really want to ask is "how to determine the best initial learning rate". If I'm right, then you need to learn about hyperparameter tuning.
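The simplest form of such tuning is a plain sweep: train briefly at each candidate initial learning rate with everything else fixed, then compare validation losses. A rough sketch (the candidate grid is an illustrative assumption, and it presumes the LearningRateScheduler callback is removed from duo_LSTM_model so that the fixed learning_rate actually takes effect):

# Assumes X_train, y_train, X_test, y_test, num_classes from the question,
# and that duo_LSTM_model no longer attaches the lr_schedule callback.
candidate_lrs = [1e-4, 5e-4, 1e-3, 5e-3, 1e-2]  # illustrative grid

results = {}
for lr in candidate_lrs:
    history, _ = duo_LSTM_model(X_train, y_train, X_test, y_test,
                                num_classes, learning_rate=lr, epochs=5)
    results[lr] = min(history.history['val_loss'])

best_lr = min(results, key=results.get)
print('best initial learning rate:', best_lr)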


Answer to Q1:

To show how LearningRateScheduler works, let's create a simple regression task:

import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

x = np.linspace(0, 100, 1000)
y = np.sin(x) + x**2
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.3)

# Note: `y` is reused below as the layer output; y_train and y_val were
# already extracted above, so the targets are unaffected.
input_x = Input(shape=(1,))
y = Dense(10, activation='relu')(input_x)
y = Dense(1, activation='relu')(y)
model = Model(inputs=input_x, outputs=y)

adamopt = tf.keras.optimizers.Adam(lr=0.01, beta_1=0.9, beta_2=0.999, epsilon=1e-8)

def schedule_func(epoch):
    print()
    print('calling lr_scheduler on epoch %i' % epoch)
    print('current learning rate %.8f' % K.eval(model.optimizer.lr))
    print('returned value %.8f' % (1e-8 * 10**(epoch / 20)))
    return 1e-8 * 10**(epoch / 20)

lr_schedule = tf.keras.callbacks.LearningRateScheduler(schedule_func)

model.compile(loss='mse', optimizer=adamopt, metrics=['mae'])

history = model.fit(x_train, y_train,
                    batch_size=8,
                    epochs=10,
                    validation_data=(x_val, y_val),
                    verbose=1,
                    callbacks=[lr_schedule])

In the script above, instead of using a lambda I wrote a named function, schedule_func. Run the script and you will see that 1e-8 * 10**(epoch / 20) simply sets the learning rate for each epoch, and that the learning rate keeps increasing.
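To make the formula concrete: the exponent epoch / 20 means the returned learning rate is multiplied by 10 every 20 epochs, starting from 1e-8. Tabulating it, and then plotting loss against learning rate (the plot this growing schedule is usually paired with, e.g. in the Coursera course), shows where training behaves well. The plotting part below is my own sketch, assuming history comes from the script above:

import numpy as np
import matplotlib.pyplot as plt

# The schedule grows by a factor of 10 every 20 epochs:
for epoch in [0, 20, 40, 60, 80, 100]:
    print(epoch, 1e-8 * 10**(epoch / 20))
# -> 1e-08, 1e-07, 1e-06, 1e-05, 1e-04, 1e-03

# Plot loss against the learning rate used at each epoch; a common heuristic
# is to pick a rate slightly below the point where the loss is lowest.
lrs = 1e-8 * 10**(np.arange(len(history.history['loss'])) / 20)
plt.semilogx(lrs, history.history['loss'])
plt.xlabel('learning rate (log scale)')
plt.ylabel('training loss')
plt.show()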

Answer to Q2:

There are many good posts about this, for example the two below (a sketch comparing the three optimizers follows the list):

  1. Setting the learning rate of your neural network.
  2. Choosing a learning rate
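As for comparing the three optimizers from the question, one option is to run the same growing-learning-rate experiment once per optimizer and overlay the loss curves. A rough sketch under stated assumptions (build_model is a hypothetical helper that rebuilds an identical fresh model for each run; it is not defined in the question):

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

optimizers = {
    'Adam': lambda: tf.keras.optimizers.Adam(),
    'RMSprop': lambda: tf.keras.optimizers.RMSprop(),
    'SGD': lambda: tf.keras.optimizers.SGD(momentum=0.9),
}

# Same growing schedule as in the question: x10 every 20 epochs from 1e-8.
lr_schedule = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-8 * 10**(epoch / 20))

epochs = 100
for name, make_opt in optimizers.items():
    model = build_model()  # hypothetical: returns a fresh, identical model
    model.compile(loss='binary_crossentropy', optimizer=make_opt(),
                  metrics=['accuracy'])
    history = model.fit(X_train, y_train, epochs=epochs, verbose=0,
                        callbacks=[lr_schedule])
    lrs = 1e-8 * 10**(np.arange(epochs) / 20)
    plt.semilogx(lrs, history.history['loss'], label=name)

plt.legend()
plt.xlabel('learning rate (log scale)')
plt.ylabel('training loss')
plt.show()

Each optimizer typically tolerates a different range, so the usable window you read off the plot (where the loss curve descends smoothly) will generally differ between Adam, RMSprop and SGD.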