I'm sure I'm missing something obvious here. Here is the tail end of my code:
# simple loss function
loss = tf.reduce_sum(tf.abs(tf.sub(x4, yn)))
train_step = tf.train.GradientDescentOptimizer(0.000001).minimize(loss)

with tf.Session() as sess:
    tf.initialize_all_variables().run()
    print(sess.run([tf.reduce_sum(w1), tf.reduce_sum(b1)]))
    for i in range(5):
        # fill in x1 and yn
        sess.run(train_step, feed_dict={x1: in_images, yn: out_images})
        print(sess.run([tf.reduce_sum(w1), tf.reduce_sum(b1)]))
The network that the loss function descends into is a simple CNN of conv2d's, bias_add's, and elu's. I want to watch how the first layer's weights and biases change as training runs. The first print returns the expected values ([+/- 100 or so, 0]), since w1 is initialized with random normals and b1 is initialized with zeros.
The second print statement gives a different pair of values, as expected.
What is not expected is that every time through the loop, the second print statement prints the same pair of values, as if each invocation of train_step is doing exactly the same thing each time rather than updating the values of the variables in the loss network.
What am I missing here?
Here is a cut and paste of the interesting part of the output:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
[-50.281082, 0.0]
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 3.98GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
[112.52832, 0.078026593]
[112.52832, 0.078026593]
[112.52832, 0.078026593]
[112.52832, 0.078026593]
[112.52832, 0.078026593]
I can post the network itself if that would help, but I suspect the problem is my mental model of how TensorFlow updates its state.
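For what it's worth, my mental model is that each call to sess.run(train_step) mutates the variables in place, so repeated calls should print different values, as in this toy example (purely illustrative, not part of my actual code):

import tensorflow as tf

# toy example: repeatedly running a training op should move the variable
v = tf.Variable(5.0)
toy_loss = tf.square(v - 1.0)
toy_step = tf.train.GradientDescentOptimizer(0.1).minimize(toy_loss)

with tf.Session() as sess:
    tf.initialize_all_variables().run()
    for i in range(3):
        sess.run(toy_step)
        print(sess.run(v))  # 4.2, 3.56, 3.048 -- a different value each iteration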
Here is the entire Python program, with dummy routines standing in for the image input, that shows the problem:
import tensorflow as tf
import numpy as np
from scipy import misc
H = 128
W = 128
x1 = tf.placeholder(tf.float32, [None, H, W, 1], "input_image")
yn = tf.placeholder(tf.float32, [None, H-12, W-12, 1], "test_image")
w1 = tf.Variable(tf.random_normal([7, 7, 1, 64])) # 7x7, 1 input chan, 64 output chans
b1 = tf.Variable(tf.constant(0.1, shape=[64]))
x2 = tf.nn.conv2d(x1, w1, [1,1,1,1], "VALID")
x2 = tf.nn.bias_add(x2, b1)
x2 = tf.nn.elu(x2)
w2 = tf.Variable(tf.random_normal([5, 5, 64, 32])) # 5x5, 64 input 32 output chans
b2 = tf.Variable(tf.constant(0.1, shape=[32]))
x3 = tf.nn.conv2d(x2, w2, [1,1,1,1], "VALID")
x3 = tf.nn.bias_add(x3, b2)
x3 = tf.nn.elu(x3)
w3 = tf.Variable(tf.random_normal([3, 3, 32, 1])) # 3x3, 32 input chans, 1 output chan
b3 = tf.Variable(tf.constant(0.1, shape=[1]))
x4 = tf.nn.conv2d(x3, w3, [1,1,1,1], "VALID")
x4 = tf.nn.bias_add(x4, b3)
x4 = tf.nn.elu(x4)
loss = tf.reduce_sum(tf.abs(tf.sub(x4, yn)))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
# fake for testing
in_images = np.random.rand(20, 128, 128, 1)
out_images = np.random.rand(20, 116, 116, 1)
with tf.Session() as sess:
    tf.initialize_all_variables().run()
    print(sess.run([tf.reduce_mean(w1), tf.reduce_mean(b1)]))
    for i in range(5):
        # fill in x1 and yn
        sess.run(train_step, feed_dict={x1: in_images, yn: out_images})
        print(sess.run([tf.reduce_mean(w1), tf.reduce_mean(b1)]))
I've looked at a bunch of other training examples, but I still don't see what I'm doing wrong. Changing the learning rate only changes the numbers that get printed; the behaviour stays the same: no apparent change from running the optimizer.
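(As a sanity check, one thing I could do is fetch the raw gradients and confirm they are nonzero. The sketch below is illustrative only, reusing the loss, placeholders and dummy data defined above, with compute_gradients/apply_gradients instead of minimize:)

# sketch: look at the gradients the optimizer would apply (illustrative only)
opt = tf.train.GradientDescentOptimizer(0.001)
grads_and_vars = opt.compute_gradients(loss)   # list of (gradient, variable) pairs
train_step = opt.apply_gradients(grads_and_vars)

with tf.Session() as sess:
    tf.initialize_all_variables().run()
    grad_sums = sess.run([tf.reduce_sum(tf.abs(g)) for g, v in grads_and_vars],
                         feed_dict={x1: in_images, yn: out_images})
    print(grad_sums)  # all-zero values would mean no gradient is flowing back to the variables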
Answer 0 (score: 1)
The error was in how I was computing the loss function. I was simply adding up all of the errors across the batch, rather than taking the mean error for each pair of images. The following loss function
# simple loss function
diff_image = tf.abs(tf.sub(x4, yn))
# sum over all dimensions except the batch dim
err_sum = tf.reduce_sum(diff_image, [1, 2, 3])
# take the mean over the batch
loss = tf.reduce_mean(err_sum)
actually starts to converge with the AdamOptimizer. The GradientDescentOptimizer still exhibits the "changes only once" behaviour, which I'll treat as a bug and report on GitHub.
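Put together, the working version of the training tail looks roughly like this (a sketch against the same graph as above; the 1e-4 Adam learning rate is an assumed value, not one from my runs):

# loss: per-image error summed over height, width and channels, then averaged over the batch
diff_image = tf.abs(tf.sub(x4, yn))
err_sum = tf.reduce_sum(diff_image, [1, 2, 3])
loss = tf.reduce_mean(err_sum)

# Adam converges here where plain gradient descent stalled
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)

with tf.Session() as sess:
    tf.initialize_all_variables().run()
    for i in range(5):
        sess.run(train_step, feed_dict={x1: in_images, yn: out_images})
        print(sess.run([tf.reduce_mean(w1), tf.reduce_mean(b1)]))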