TensorFlow dropout: how do I apply different values for train vs test?

Date: 2017-10-17 18:23:26

Tags: python tensorflow

I am trying to implement dropout in TensorFlow for a simple 3-layer neural network for classification, and I am running into problems. More specifically, I am trying to apply a different value of the dropout parameter pkeep at training vs test time.

The approach I have taken is as follows:

1) def create_placeholders(n_x, n_y):

    X = tf.placeholder("float", [n_x, None])
    Y = tf.placeholder("float", [n_y, None])
    pkeep = tf.placeholder(tf.float32)
    return X, Y, pkeep

2) In the function forward_propagation(X, parameters, pkeep), do the following:

Z1 = tf.add(tf.matmul(W1, X), b1)
A1 = tf.nn.relu(Z1)
A1d = tf.nn.dropout(A1, pkeep)       # keeps each unit with probability pkeep
Z2 = tf.add(tf.matmul(W2, A1d), b2)
A2 = tf.nn.relu(Z2)
A2d = tf.nn.dropout(A2, pkeep)       # pkeep = 1.0 leaves activations unchanged
Z3 = tf.add(tf.matmul(W3, A2d), b3)

return Z3

3) Later, when invoking the TensorFlow session (intervening lines of code omitted for clarity):

X, Y, pkeep = create_placeholders(n_x, n_y)

Z3 = forward_propagation(X, parameters, pkeep)

sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y, pkeep: 0.75})

The above runs without any errors. However, I think it sets the pkeep value to 0.75 for both the training and the test runs. Minibatching is done only on the train dataset, but I am not setting the pkeep value anywhere else.

I would like to set pkeep = 0.75 for training and pkeep = 1.0 for testing.

4) It does throw an error when I do something like this:

x_train_eval = Z3.eval(feed_dict={X: X_train, Y: Y_train,  pkeep: 0.75})

x_test_eval = Z3.eval(feed_dict={X: X_test, Y: Y_test,  pkeep: 1.0})

The error message I receive is:

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_2' with dtype float
     [[Node: Placeholder_2 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

What is the best way to pass different pkeep values for training and testing? Your suggestions would be most welcome.

2 Answers:

Answer 0 (score: 0)

Assuming you have some test operation defined as test_op (for example, something that evaluates the accuracy on input data and labels), you can do something like this:

for i in range(num_iters):
    # Run your training process.
    _, loss = sess.run([optimizer, cost],
                       feed_dict={X:minibatch_X, Y:minibatch_Y, pkeep: 0.75})
# Test the model after training
test_accuracy = sess.run(test_op,
                         feed_dict={X:test_X, Y:test_Y, pkeep: 1.0})

Basically, you are not testing the model while you are training it, so you can just call your test_op when you are ready to test, feeding it different hyperparameters such as pkeep. The same applies to periodically validating the model on a held-out dataset during training: every so often, run your evaluation op against the held-out set, passing in hyperparameters that differ from the ones used during training, and then save the configuration or stop early based on the validation accuracy, as in the sketch below.
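
A minimal sketch of that periodic-validation pattern, reusing the ops above (valid_X, valid_Y, validate_every, and the tf.train.Saver are assumptions added for illustration, not part of the original code):

best_accuracy = 0.0
saver = tf.train.Saver()
for i in range(num_iters):
    # Training step: dropout active.
    _, loss = sess.run([optimizer, cost],
                       feed_dict={X: minibatch_X, Y: minibatch_Y, pkeep: 0.75})

    # Periodic validation on held-out data: dropout disabled.
    if i % validate_every == 0:
        valid_accuracy = sess.run(test_op,
                                  feed_dict={X: valid_X, Y: valid_Y, pkeep: 1.0})
        if valid_accuracy > best_accuracy:
            best_accuracy = valid_accuracy
            saver.save(sess, './best_model.ckpt')  # keep the best configuration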

Your test_op might return something like the prediction accuracy of a classifier:

correct = tf.equal(tf.argmax(y, 1), tf.argmax(predict, 1))
test_op = tf.reduce_mean(tf.cast(correct, tf.float32))

where y is your target labels and predict is the name of your prediction op.
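
One caveat when tying this back to the question's graph (an added note, not part of the original answer): the question lays tensors out as [features, examples], so the class axis there is 0 rather than 1. A sketch for that layout, with Z3 taken from the question's forward_propagation:

# Z3 is [n_y, None] in the question, so softmax/argmax run over axis 0 here.
predict = tf.nn.softmax(Z3, dim=0)                      # class probabilities
correct = tf.equal(tf.argmax(predict, 0), tf.argmax(Y, 0))
test_op = tf.reduce_mean(tf.cast(correct, tf.float32))  # fraction correct

test_accuracy = sess.run(test_op,
                         feed_dict={X: X_test, Y: Y_test, pkeep: 1.0})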

Answer 1 (score: 0)

In a neural network, you forward propagate to compute the logits (unnormalized scores for each label) and compare them with the actual values to get the error.

Training means minimizing that error by backpropagation, i.e. finding the derivative of the error with respect to each weight and subtracting that derivative, scaled by the learning rate, from the weight's value.
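
As a concrete illustration of that update rule (a sketch, assuming a scalar loss tensor like the one defined below), the plain gradient-descent step w := w - lr * dE/dw can be spelled out explicitly; the minimize() call used later bundles exactly these two stages:

# Illustration only: minimize() is shorthand for these two stages.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
grads_and_vars = optimizer.compute_gradients(loss)    # dE/dw for every weight
train_op = optimizer.apply_gradients(grads_and_vars)  # w := w - lr * dE/dw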

Applying different dropout parameters is easier when you separate the operations needed for training from those needed for evaluating your model.

Training just means minimizing the loss:

def loss(logits, labels):
    '''
    Calculate cross entropy loss
    '''
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)
    return tf.reduce_mean(cross_entropy, name='loss_op')


def train(loss, learning_rate):
    '''
    Train model by optimizing gradient descent
    '''
    optimizer = tf.train.AdamOptimizer(learning_rate)
    train_op = optimizer.minimize(loss, name='train_op')
    return train_op

Evaluation just means computing the accuracy:

def accuracy(logits, labels):
    '''
    Calculate accuracy of logits at predicting labels
    '''
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy_op = tf.reduce_mean(tf.cast(correct_prediction, tf.float32),
                                 name='accuracy_op')
    return accuracy_op
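
For context, the graph wiring that the session code below takes for granted might look roughly like this (a sketch with assumed names: inference stands in for any model function that returns logits, and the batch-major [examples, features] layout matches this answer's axis=1 argmax):

# Hypothetical graph construction (names assumed, not from the original answer).
x_pl = tf.placeholder(tf.float32, [None, n_x])
y_pl = tf.placeholder(tf.float32, [None, n_y])
keep_prob_pl = tf.placeholder(tf.float32)

logits = inference(x_pl, keep_prob_pl)           # model with tf.nn.dropout inside
loss_op = loss(logits, y_pl)
train_op = train(loss_op, learning_rate=0.0001)
accuracy_op = accuracy(logits, y_pl)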

Then, in your TensorFlow session, after creating the placeholders and adding the necessary operations to the graph and so on, simply feed different keep_prob_pl values to the run method:

# Train the model (dropout active)
sess.run(train_op, feed_dict={x_pl: x_train, y_pl: y_train, keep_prob_pl: 0.75})

# Evaluate test data
batch_size = 100
num_batches = data.num_examples // batch_size
acc = 0.0
for i in range(num_batches):
    batch_x, batch_y = data.next_batch(batch_size)
    feed_dict = {x_pl: batch_x, y_pl: batch_y, keep_prob_pl: 1.0}
    acc += sess.run(accuracy_op, feed_dict=feed_dict)

print("Test accuracy: {:.4f}".format(acc / num_batches))