I am trying to implement dropout in TensorFlow for a simple 3-layer neural network for classification, and am running into problems. More specifically, I am trying to apply a different dropout parameter pkeep value when I train vs. when I test.
My approach is as follows:
1) def create_placeholders(n_x, n_y):
       X = tf.placeholder("float", [n_x, None])
       Y = tf.placeholder("float", [n_y, None])
       pkeep = tf.placeholder(tf.float32)
       return X, Y, pkeep
2) In the function forward_propagation(X, parameters, pkeep), I do the following:
Z1 = tf.add(tf.matmul(W1, X), b1)
A1 = tf.nn.relu(Z1)
A1d = tf.nn.dropout(A1, pkeep)
Z2 = tf.add(tf.matmul(W2, A1d), b2)
A2 = tf.nn.relu(Z2)
A2d = tf.nn.dropout(A2, pkeep)
Z3 = tf.add(tf.matmul(W3, A2d), b3)
return Z3
3) Later, when the TensorFlow session is called (lines of code omitted in between for clarity):
X, Y, pkeep = create_placeholders(n_x, n_y)
Z3 = forward_propagation(X, parameters, pkeep)
sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y, pkeep: 0.75})
The above runs without raising any errors. However, I think it sets the pkeep value to 0.75 for both the training and the test runs. Minibatching is done on the train dataset only, but I do not set the pkeep value anywhere else.
I would like to set pkeep = 0.75 for training and pkeep = 1.0 for testing.
4) It does give an error when I do something like this:
x_train_eval = Z3.eval(feed_dict={X: X_train, Y: Y_train, pkeep: 0.75})
x_test_eval = Z3.eval(feed_dict={X: X_test, Y: Y_test, pkeep: 1.0})
The error message I get is:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_2' with dtype float
[[Node: Placeholder_2 = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
What is the best way to pass different pkeep values for training and testing? Your suggestions would be much appreciated.
Answer 0 (score: 0)
Assuming you have some test operation defined as test_op (for example, something that evaluates accuracy on input data and labels), you can do something like this:
for i in range(num_iters):
    # Run your training process.
    _, loss = sess.run([optimizer, cost],
                       feed_dict={X: minibatch_X, Y: minibatch_Y, pkeep: 0.75})

# Test the model after training
test_accuracy = sess.run(test_op,
                         feed_dict={X: test_X, Y: test_Y, pkeep: 1.0})
Basically, you are not testing the model while you are training it, so you can just call your test_op whenever you are ready to test, feeding it different hyperparameters such as pkeep. The same goes for periodically validating the model on a held-out dataset during training: every so often, run your evaluation op on the held-out set, passing hyperparameters that differ from those used in training; you can then save the configuration or stop early based on your validation accuracy.
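For instance, a periodic validation check inside the training loop might look like the sketch below; the names valid_X and valid_Y, the 100-step interval, and the saver are illustrative assumptions, not part of the original code:

best_accuracy = 0.0
saver = tf.train.Saver()  # assumed: used only to checkpoint the best model
for i in range(num_iters):
    # Training step: dropout is active (pkeep < 1).
    _, loss = sess.run([optimizer, cost],
                       feed_dict={X: minibatch_X, Y: minibatch_Y, pkeep: 0.75})
    # Every 100 steps, validate on held-out data with dropout disabled.
    if i % 100 == 0:
        valid_accuracy = sess.run(test_op,
                                  feed_dict={X: valid_X, Y: valid_Y, pkeep: 1.0})
        if valid_accuracy > best_accuracy:
            best_accuracy = valid_accuracy
            saver.save(sess, './best_model.ckpt')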
Your test_op might return something like the prediction accuracy of a classifier:
correct = tf.equal(tf.argmax(y, 1), tf.argmax(predict, 1))
test_op = tf.reduce_mean(tf.cast(correct, tf.float32))
where y is your target labels and predict is the name of your prediction op.
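To make that concrete in terms of the question's network: predict can simply be the logits Z3, since softmax does not change the argmax. A minimal sketch, assuming the placeholders and parameters from the question; note that the question's tensors are laid out [features, batch], so argmax runs over axis 0 rather than the more common axis 1:

X, Y, pkeep = create_placeholders(n_x, n_y)
Z3 = forward_propagation(X, parameters, pkeep)

# Per-example predicted class vs. true class, taken down the class axis (0).
correct = tf.equal(tf.argmax(Z3, 0), tf.argmax(Y, 0))
test_op = tf.reduce_mean(tf.cast(correct, tf.float32))

The same X, Y, and pkeep placeholders are then fed with pkeep: 0.75 during training and pkeep: 1.0 when running test_op.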
Answer 1 (score: 0)
In a neural network, you forward-propagate to compute the logits (a score for each label) and compare them with the actual values to get the error.
Training consists of minimizing that error by propagating it backwards, i.e. finding the derivative of the error with respect to each weight and subtracting that value from the weight.
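(In symbols, the plain gradient-descent update is w := w − η · ∂E/∂w, where η is the learning rate; optimizers such as Adam, used below, refine this basic rule.)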
Applying a different dropout parameter is easier when you separate the operations needed to train your model from those needed to evaluate it.
Training is just minimizing the loss:
def loss(logits, labels):
    '''
    Calculate cross entropy loss
    '''
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)
    return tf.reduce_mean(cross_entropy, name='loss_op')
def train(loss, learning_rate):
    '''
    Train the model by minimizing the loss (Adam optimizer)
    '''
    optimizer = tf.train.AdamOptimizer(learning_rate)
    train_op = optimizer.minimize(loss, name='train_op')
    return train_op
Evaluation is just computing the accuracy:
def accuracy(logits, labels):
    '''
    Calculate accuracy of logits at predicting labels
    '''
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    accuracy_op = tf.reduce_mean(tf.cast(correct_prediction, tf.float32),
                                 name='accuracy_op')
    return accuracy_op
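Putting the pieces together, graph construction before the session might look like the following sketch; the placeholder shapes, num_features, num_classes, the learning rate, and an inference function playing the role of the question's forward_propagation are all assumptions for illustration:

# Placeholders; this answer's layout is [batch, features] / [batch, classes].
x_pl = tf.placeholder(tf.float32, [None, num_features])
y_pl = tf.placeholder(tf.float32, [None, num_classes])
keep_prob_pl = tf.placeholder(tf.float32)

# Wire up logits -> loss -> train op, plus the accuracy op for evaluation.
logits = inference(x_pl, keep_prob_pl)  # assumed forward pass with dropout
loss_op = loss(logits, y_pl)
train_op = train(loss_op, learning_rate=1e-4)
accuracy_op = accuracy(logits, y_pl)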
Then, in your TensorFlow session, after generating the placeholders, adding the necessary operations to the graph, and so on, just feed different keep_prob_pl values to the run method:
# Train model
sess.run(train_op, feed_dict={x_pl: x_train, y_pl: y_train, keep_prob_pl: 0.75})

# Evaluate test data
batch_size = 100
epoch = data.num_examples // batch_size
acc = 0.0
for i in range(epoch):
    batch_x, batch_y = data.next_batch(batch_size)
    feed_dict = {x_pl: batch_x, y_pl: batch_y, keep_prob_pl: 1.0}
    acc += sess.run(accuracy_op, feed_dict=feed_dict)
print(("Epoch Accuracy: {:.4f}").format(acc / epoch))