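Good morning. I am implementing a binary classification model with two distinct one-hot encoded labels. I have finally reached the stage where the program outputs something, but unfortunately it keeps predicting the same class. On one run it did not, so I wonder whether the problem is that training easily falls into a local minimum of the loss function. Alternatively, the problem may be saturation of the softmax activation. The variables are initialized with tf.truncated_normal, and I have been decreasing the standard deviation in case the problem really is softmax saturation. The images are RGB. Also, my computer is not the most capable: I am training on batches of 50-100 images (they are fairly large, 480 * 704) for about 20-40 epochs.
The model itself is: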
import datetime
import tensorflow as tf

with tf.Graph().as_default():
    print("Creating Graph [{}]".format(datetime.datetime.now().strftime("%H:%M:%S")))
    x = tf.placeholder(tf.float32, [None, 480, 704, 3])
    y_true = tf.placeholder(tf.float32, [None, 2])
    is_training = tf.placeholder(tf.bool, [])
    with tf.name_scope("Conv_layers"):
        conv_1 = conv_layer(x, [5, 5, 3, 2])
        conv_pool_1 = max_pool_4x4(conv_1)
        conv_2 = conv_layer(conv_pool_1, [5, 5, 2, 4])
        conv_pool_2 = max_pool_2x2(conv_2)
        conv_3 = conv_layer(conv_pool_2, [5, 5, 4, 8])
        conv_pool_3 = max_pool_2x2(conv_3)
        conv_4 = conv_layer(conv_pool_3, [5, 5, 8, 16])
        conv_pool_4 = max_pool_2x2(conv_4)
    to_flat = tf.reshape(conv_pool_4, [-1, 22*15*16])
    full_1 = full_layer(to_flat, 1024)
    y_conv = full_layer(full_1, 2)
    y_conv = tf.cond(is_training, lambda: tf.identity(y_conv), lambda: tf.nn.softmax(y_conv))
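As a sanity check on the 22*15*16 in the reshape above, here is a minimal plain-Python sketch of the shape arithmetic. It assumes that a stride-1 'SAME' convolution keeps the height and width, and that 'SAME' pooling produces ceil(input / stride) outputs per dimension. The helper names same_pool_out and flattened_size are made up for this illustration and are not part of the model.

```python
import math


def same_pool_out(size, stride):
    # With padding='SAME', the output length is ceil(input length / stride).
    return math.ceil(size / stride)


def flattened_size(height, width, pool_strides, channels):
    # Stride-1 'SAME' convolutions leave height and width unchanged,
    # so only the pooling layers shrink the feature map.
    for stride in pool_strides:
        height = same_pool_out(height, stride)
        width = same_pool_out(width, stride)
    return height, width, height * width * channels


# One 4x4 pool followed by three 2x2 pools, 16 channels after the last conv.
h, w, flat = flattened_size(480, 704, [4, 2, 2, 2], 16)
print(h, w, flat)  # 15 22 5280
```

Under those assumptions the final feature map is 15 x 22 x 16, which matches the 22*15*16 used in the reshape.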
Loss function and accuracy:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(0.03).minimize(cross_entropy)
correct_pred = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
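To make the loss formula concrete, here is a minimal plain-Python sketch of the same expression, the batch mean of -sum(y_true * log(p)). It is only an illustration under the assumption that the second argument holds softmax probabilities; the logits and labels below are hypothetical values, and softmax and cross_entropy are local helpers, not TensorFlow calls.

```python
import math


def softmax(logits):
    # Subtract the maximum first for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true_batch, y_prob_batch):
    # Mean over the batch of -sum(y_true * log(p)), as in the formula above.
    losses = []
    for y_true, y_prob in zip(y_true_batch, y_prob_batch):
        losses.append(-sum(t * math.log(p) for t, p in zip(y_true, y_prob)))
    return sum(losses) / len(losses)


logits = [[2.0, -1.0], [0.5, 1.5]]  # hypothetical raw scores
labels = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical one-hot labels
probs = [softmax(row) for row in logits]
loss = cross_entropy(labels, probs)
print(loss)
```

The logarithm is only defined for positive inputs, so the formula relies on receiving probabilities rather than raw scores (math.log raises ValueError for a value such as -1.0). In the graph above, y_conv only passes through tf.nn.softmax when is_training is false, so it may be worth checking what the loss actually receives during training.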
Functions used:
def weight_variables(shape):
    initializer = tf.truncated_normal(shape=shape, stddev=0.05)
    return tf.Variable(initializer)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

def max_pool_4x4(x):
    return tf.nn.max_pool(x, ksize=[1, 4, 4, 1], strides=[1, 4, 4, 1], padding='SAME')

def conv_layer(input, shape):
    W = weight_variables(shape=shape)
    b = bias_variable([shape[3]])
    return tf.nn.relu(conv2d(input, W))

def full_layer(input, size):
    in_size = int(input.get_shape()[1])
    W = weight_variables([in_size, size])
    b = bias_variable([size])
    return tf.matmul(input, W) + b
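For readers unfamiliar with the initializer, here is a rough plain-Python sketch of what a truncated normal draw looks like. As far as I understand the TensorFlow documentation, tf.truncated_normal re-draws any value that lies more than two standard deviations from the mean. The functions below are a stand-in written for illustration only: they return a flat list rather than a tensor, and the seed parameter is an assumption of this sketch.

```python
import random


def truncated_normal_sample(stddev, mean=0.0, rng=random):
    # Re-draw until the value lies within two standard deviations of the mean.
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


def truncated_normal(shape, stddev, seed=0):
    # Returns a flat list with one entry per element of the given shape.
    rng = random.Random(seed)
    count = 1
    for dim in shape:
        count *= dim
    return [truncated_normal_sample(stddev, rng=rng) for _ in range(count)]


# Same shape and stddev as the first convolution kernel above.
weights = truncated_normal([5, 5, 3, 2], stddev=0.05)
print(len(weights), max(abs(w) for w in weights) <= 0.1)  # 150 True
```

With stddev=0.05 every initial weight therefore lies in [-0.1, 0.1], and lowering stddev narrows that range proportionally.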
Part of a typical prediction is:
[[0.28597853 0.71402144]
[0.28610235 0.71389765]
[0.28605604 0.713944 ]
[0.28603107 0.71396893]
[0.28613603 0.7138639 ]
[0.2860006 0.7139994 ]
[0.28612924 0.71387076]
[0.28628975 0.71371025]
[0.28614312 0.7138569 ]
[0.28609362 0.71390635]
[0.28626445 0.7137355 ]
[0.28617397 0.71382606]]
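A small plain-Python sketch of how the collapse can be quantified, using the first four rows of the prediction above. The argmax helper is written here for illustration; it picks the index of the largest entry in a row.

```python
# First four rows of the prediction shown above.
preds = [
    [0.28597853, 0.71402144],
    [0.28610235, 0.71389765],
    [0.28605604, 0.713944],
    [0.28603107, 0.71396893],
]


def argmax(row):
    # Index of the largest entry in the row.
    return max(range(len(row)), key=row.__getitem__)


classes = [argmax(row) for row in preds]
# How much the class-1 probability varies from one image to the next.
spread = max(row[1] for row in preds) - min(row[1] for row in preds)
print(classes, spread)
```

Every row selects class 1, and the class-1 probability varies by only about 0.0001 across the rows, so the output appears to be almost independent of the input image.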
Increasing the size of the convolutional layers makes my model output something like this:
[[0. 1.]
[0. 1.]
[0. 1.]
[0. 1.]
[0. 1.]
[0. 1.]
[0. 1.]]
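To illustrate what softmax saturation would look like, here is a minimal plain-Python sketch. The logits are hypothetical: [0.0, 0.9] was chosen because it gives roughly the 0.29 / 0.71 split seen earlier, and the scale factors simply mimic larger pre-softmax values. Whether larger layers actually produce larger logits in this model is an assumption, not something the outputs prove.

```python
import math


def softmax(logits):
    # Subtract the maximum first for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Scale a hypothetical pair of logits to mimic larger pre-softmax values.
for scale in (1, 10, 100):
    print(scale, softmax([0.0, 0.9 * scale]))
```

As the gap between the two logits grows, the output approaches [0, 1], which printed at limited precision looks like the rows above.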