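I am using TensorFlow to train a neural network by gradually adjusting its output based on the results of a physics simulation. The setup works like this: random noise is fed into a dense network, and the output comes from a tanh activation, so it is a vector of numbers between -1 and 1 (6 of them in the examples shown). I take those numbers and use them in a physics simulation and, depending on the result, I want to make small changes to the output by creating a label to train against, where the label is generated by scaling part of the original output. I have created two Python example files that summarize the problem I am running into.
For example, if the output is: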
[0.8, 0.4, -0.3, 0.1, -0.9, 0.5]
I would create the label by multiplying the original output element-wise by a vector of scale factors, like this:
[1.1, 1.1, 1.1, 1.0, 1.0, 1.0] * [0.8, 0.4, -0.3, 0.1, -0.9, 0.5]
= [0.88, 0.44, -0.33, 0.1, -0.9, 0.5]
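The label construction above can be sketched in NumPy as follows. The variable names (`output`, `scale`, `label`) are illustrative only and are not taken from the two example files.

```python
import numpy as np

# Example network output (values from the text above).
output = np.array([0.8, 0.4, -0.3, 0.1, -0.9, 0.5])

# Scale factors: nudge the first three components by 10%, leave the rest unchanged.
scale = np.array([1.1, 1.1, 1.1, 1.0, 1.0, 1.0])

# Element-wise product gives the training label.
label = scale * output
print(label)
```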
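After running one training step with the Adam optimizer to reduce the MSE between the output and the label (above), I expect the output (given the same noise input) to have shifted closer to the label.
This is exactly what happens when I compute the label from the network output in NumPy and feed the entire computed label into TensorFlow through a placeholder; it works as I expect (Code 1 below).
Code 1 output: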
initial output:
[[ 0.4489846 0.29872498 -0.6448472 -0.2561627 -0.26684356 0.5127055 ]]
label:
[[ 0.49388307 0.3285975 -0.7093319 -0.2561627 -0.26684356 0.5127055 ]]
trained output:
[[ 0.4666028 0.318578 -0.6672877 -0.25193644 -0.2668022 0.51966786]]
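To make "closer to the label" concrete, the MSE against the label can be compared before and after the training step. This is a small NumPy sketch using the numbers printed above; the helper `mse` is a hypothetical name introduced for illustration.

```python
import numpy as np

def mse(a, b):
    # Mean squared error between two arrays of the same shape.
    return float(np.mean((a - b) ** 2))

# Values copied from the Code 1 output above.
initial = np.array([0.4489846, 0.29872498, -0.6448472, -0.2561627, -0.26684356, 0.5127055])
label = np.array([0.49388307, 0.3285975, -0.7093319, -0.2561627, -0.26684356, 0.5127055])
trained = np.array([0.4666028, 0.318578, -0.6672877, -0.25193644, -0.2668022, 0.51966786])

mse_before = mse(initial, label)
mse_after = mse(trained, label)
print("MSE before:", mse_before)
print("MSE after: ", mse_after)
```

For these numbers the error after the step is smaller than before, which matches the expected behavior.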
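However, instead of computing the whole label in NumPy and feeding it in as a placeholder, I would like to feed a single scalar into TensorFlow and compute the label inside the graph by multiplying that scalar by the indexed part of the network output (Code 2 below). When I train with the label created inside TensorFlow this way, the network output consistently moves in the opposite direction from the label, as shown below:
Code 2 output: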
initial output:
[[ 0.4517809 -0.48581237 0.0353935 0.19568376 -0.60063607 0.16790062]]
label:
[[ 0.49695897 -0.5343936 0.03893284 0.19568376 -0.60063607 0.16790062]]
trained output:
[[ 0.42061377 -0.46081907 0.02022755 0.1914451 -0.60077196 0.15687937]]
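The direction of the change can be checked per component by comparing the sign of the actual movement with the sign of the desired movement. This is a NumPy sketch on the numbers printed above; the variable names are illustrative.

```python
import numpy as np

# Values copied from the Code 2 output above.
initial = np.array([0.4517809, -0.48581237, 0.0353935, 0.19568376, -0.60063607, 0.16790062])
label = np.array([0.49695897, -0.5343936, 0.03893284, 0.19568376, -0.60063607, 0.16790062])
trained = np.array([0.42061377, -0.46081907, 0.02022755, 0.1914451, -0.60077196, 0.15687937])

desired = label - initial    # where the label asks each component to go
actual = trained - initial   # where each component actually went

# A positive product means the component moved toward the label.
moved_toward = (desired * actual) > 0
print(moved_toward[:3])      # only the first three components were scaled
```

For the three scaled components the product is negative, so each of them moved away from its label.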
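This makes no sense to me, and I do not understand why it happens.
I know that it makes little difference in terms of computation time or load, but I now suspect that I misunderstand how TensorFlow minimizes the cost function, because I do not see why computing the label inside TensorFlow should give a different result from computing it externally in NumPy and feeding it in as a placeholder.
So, ultimately, my question is: why do these two .py files (Code 1 and Code 2) give different results? They are almost identical, except that Code 1 feeds the entire label as a feed_dict value, while Code 2 computes the label inside TensorFlow from a single feed_dict value.
Code 1 (feeding in the entire computed label):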
import tensorflow as tf
import numpy as np


def network(input):
    hidden1 = tf.layers.dense(inputs=input, units=128, activation=tf.nn.relu)
    hidden2 = tf.layers.dense(inputs=hidden1, units=128, activation=tf.nn.relu)
    output = tf.layers.dense(inputs=hidden2, units=6, activation=tf.nn.tanh)
    return output


noise_in = tf.placeholder(tf.float32, shape=[1, 100])
output = network(noise_in)

# placeholder is the entire label
label = tf.placeholder(tf.float32, shape=[1, 6])

# define loss
loss = tf.losses.mean_squared_error(labels=label, predictions=output)
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    random_noise_input = np.random.random(100).reshape(1, -1)

    # the initial output of the network
    out = sess.run(output, feed_dict={noise_in: random_noise_input})
    print("initial output: ")
    print(out, "\n")

    # define label
    label_in = np.array(out)
    label_in[:, 0:3] = label_in[:, 0:3] * 1.1

    # print the label
    out = sess.run(label, feed_dict={noise_in: random_noise_input,
                                     label: label_in})
    print("label: ")
    print(out, "\n")

    # train the network with this label
    sess.run(train, feed_dict={noise_in: random_noise_input,
                               label: label_in})

    # the output after the network is trained once
    out = sess.run(output, feed_dict={noise_in: random_noise_input})
    print("trained output: ")
    print(out, "\n")
Code 2 (computing the label inside TensorFlow):
import tensorflow as tf
import numpy as np


def network(input):
    hidden1 = tf.layers.dense(inputs=input, units=128, activation=tf.nn.relu)
    hidden2 = tf.layers.dense(inputs=hidden1, units=128, activation=tf.nn.relu)
    output = tf.layers.dense(inputs=hidden2, units=6, activation=tf.nn.tanh)
    return output


noise_in = tf.placeholder(tf.float32, shape=[1, 100])
output = network(noise_in)
reward_scaler = tf.placeholder(tf.float32, shape=[])

# multiply part of the output by a scalar value determined by the placeholder
output1 = tf.ones_like(output[:, :3]) * reward_scaler
output2 = tf.ones_like(output[:, 3:])
label = tf.concat([output1, output2], axis=1) * output

# define loss
loss = tf.losses.mean_squared_error(labels=label, predictions=output)
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    random_noise_input = np.random.random(100).reshape(1, -1)

    # the initial output of the network
    out = sess.run(output, feed_dict={noise_in: random_noise_input})
    print("initial output: ")
    print(out, "\n")

    # print the label
    out = sess.run(label, feed_dict={noise_in: random_noise_input,
                                     reward_scaler: 1.1})
    print("label: ")
    print(out, "\n")

    # train the network with this label
    sess.run(train, feed_dict={noise_in: random_noise_input,
                               reward_scaler: 1.1})

    # the output after the network is trained once
    out = sess.run(output, feed_dict={noise_in: random_noise_input})
    print("trained output: ")
    print(out, "\n")
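One way to probe the difference between the two files is to compare the gradient of the loss with respect to the output in each case. The NumPy sketch below assumes that in Code 2 the label is differentiated as a function of the output, since it is built from `output` inside the graph, while in Code 1 the fed label is a constant. It looks only at the gradient with respect to the output, not the network weights, so it indicates the direction of a small step rather than reproducing an Adam update. The function names are hypothetical.

```python
import numpy as np

def grad_fixed_label(output, label):
    # Gradient of mean((label - output)^2) w.r.t. output, label held constant.
    return 2.0 * (output - label) / output.size

def grad_dependent_label(output, scale):
    # Here label = scale * output, so the loss is mean(((scale - 1) * output)^2)
    # and its gradient w.r.t. output includes the path through the label.
    return 2.0 * (scale - 1.0) ** 2 * output / output.size

output = np.array([0.8, 0.4, -0.3, 0.1, -0.9, 0.5])
scale = np.array([1.1, 1.1, 1.1, 1.0, 1.0, 1.0])
label = scale * output

# A descent step moves the output along the negative gradient.
step_fixed = -grad_fixed_label(output, label)
step_dep = -grad_dependent_label(output, scale)
toward_label = np.sign(label - output)

print("fixed label step direction:    ", np.sign(step_fixed[:3]))
print("dependent label step direction:", np.sign(step_dep[:3]))
print("direction toward label:        ", toward_label[:3])
```

Under this assumption, the fixed-label step points toward the label, while the dependent-label step shrinks the output toward zero, which is away from a label scaled by 1.1. If that is the cause, wrapping the label in `tf.stop_gradient` before passing it to the loss may make Code 2 behave like Code 1, though that is untested here.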