I'm trying to solve the "Dog Breed Identification" challenge on Kaggle. I'm training AlexNet with TensorFlow-Slim, but I keep running into a problem: the loss keeps increasing instead of decreasing. I've tested my preprocessing pipeline and it looks fine, and the learning rate isn't particularly high. Here is the Kaggle kernel with my code: Kaggle Dog Breed Kernel. In case you can't use Kaggle, I've also included the training code below.
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import alexnet

with graph.as_default():
    # Training inputs are fed per batch; validation/test sets are held as constants
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batchsize, height, width, channels))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batchsize, num_labels))
    tf_validation_dataset = tf.constant(valid_dataset, dtype=tf.float32)
    tf_test_dataset = tf.constant(test_dataset)

    def model(data, is_training=True, reusevar=False):
        with slim.arg_scope(alexnet.alexnet_v2_arg_scope()):
            with tf.variable_scope('model') as scope:
                if reusevar:
                    scope.reuse_variables()
                outputs, end_points = alexnet.alexnet_v2(
                    data, num_classes=num_labels, is_training=is_training)
        return outputs

    logits = model(tf_train_dataset)

    # Calculate loss
    with tf.name_scope('loss'):
        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))

    # Optimization step
    optimizer = tf.train.AdamOptimizer(1e-4).minimize(loss)

    # Predictions for each dataset
    train_predictions = tf.nn.softmax(logits)
    valid_predictions = tf.nn.softmax(
        model(tf_validation_dataset, is_training=False, reusevar=True))
num_steps = 1001

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized')
    for step in range(num_steps):
        # Cycle through the training set one mini-batch at a time
        offset = (step * batchsize) % (train_data_labels.shape[0] - batchsize)
        batch_data = train_data[offset:(offset + batchsize), :, :, :]
        batch_labels = train_data_labels[offset:(offset + batchsize), :]
        feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
        _, l, predictions = session.run(
            [optimizer, loss, train_predictions], feed_dict=feed_dict)
        if step % 2 == 0:
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
            print('Validation accuracy: %.1f%%' % accuracy(
                valid_predictions.eval(), valid_data_labels))
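The `accuracy` helper called in the loop isn't shown above; a minimal sketch of what it presumably does, assuming one-hot label rows and softmax prediction rows, would be:

```python
import numpy as np

def accuracy(predictions, labels):
    """Percentage of rows where the argmax of the softmax output
    matches the argmax of the one-hot label vector."""
    correct = np.sum(np.argmax(predictions, axis=1) == np.argmax(labels, axis=1))
    return 100.0 * correct / predictions.shape[0]
```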
I can't figure out whether the problem is in the preprocessing or in the training. Another issue I'm running into: the JPEG data is only about 364 MB on disk, but once I load all the images into the program, the NumPy arrays holding the image data and their corresponding label vectors take roughly 13 GB of memory. Is that normal, or am I making a mistake somewhere? At the moment I'm training with only about 2000 images.
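A large blow-up from compressed JPEGs to in-memory arrays is expected: JPEG is compressed, while a decoded NumPy array costs height × width × channels × bytes-per-element per image. A sketch of the arithmetic, using hypothetical numbers (2000 images at AlexNet v2's default 224×224×3 input size):

```python
import numpy as np

num_images = 2000                        # roughly the training-set size above
height, width, channels = 224, 224, 3    # AlexNet v2's default input size

# Memory for the full dataset once decoded into a dense array, by dtype
for dtype in (np.uint8, np.float32, np.float64):
    total = num_images * height * width * channels * np.dtype(dtype).itemsize
    print('%-8s %.2f GB' % (np.dtype(dtype).name, total / 1024**3))
```

At these sizes even float64 comes to only a few GB, so 13 GB suggests either larger images than 224×224, float64 arrays, or extra intermediate copies made during preprocessing. Storing the images as uint8 and converting each mini-batch to float32 just before feeding it keeps the resident footprint close to the raw pixel size.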