I have an A2C RL algorithm made up of two 3-layer dense neural networks, running on Python 3 with Keras + Tensorflow-GPU. I ran into a problem where my weights and biases were not updating, so I started inspecting the output of each layer. The actor network is as follows:
def actor(self):
    # assumes: import math; from keras import layers, initializers, optimizers
    #          from keras.models import Sequential; from keras.layers import LeakyReLU
    a_nn = Sequential()
    # input (size 5) -> three hidden Dense layers (size 50), each followed by LeakyReLU
    a_nn.add(layers.Dense(50, input_shape=(5,),
                          kernel_initializer=initializers.truncated_normal(mean=0, stddev=(2 / math.sqrt(self.state_size * 50))),
                          bias_initializer=initializers.Constant(value=0.5), name='Actor_1'))
    a_nn.add(LeakyReLU(alpha=0.2))
    a_nn.add(layers.Dense(50,
                          kernel_initializer=initializers.truncated_normal(mean=0, stddev=(2 / math.sqrt(50 * 50))),
                          bias_initializer=initializers.Constant(value=0.5), name='Actor_2'))
    a_nn.add(LeakyReLU(alpha=0.2))
    a_nn.add(layers.Dense(50,
                          kernel_initializer=initializers.truncated_normal(mean=0, stddev=(2 / math.sqrt(50 * 50))),
                          bias_initializer=initializers.Constant(value=0.5), name='Actor_3'))
    a_nn.add(LeakyReLU(alpha=0.2))
    # output layer: softmax over the 21 discrete actions
    a_nn.add(layers.Dense(21, kernel_initializer=initializers.truncated_normal(mean=0, stddev=(2 / math.sqrt(50 * 21))),
                          bias_initializer=initializers.Constant(value=0.5), activation='softmax', name='Actor_Output'))
    a_nn.compile(loss='mse', optimizer=optimizers.Adam(lr=self.a_alpha))
    return a_nn  # output is the probability of each action
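For what it's worth, I did verify that the freshly initialized weights and biases are finite; a minimal sketch of that check (assuming self.actor holds the compiled network returned by actor()):

import numpy as np

# Sanity check (sketch): confirm every layer's freshly initialized
# weights and biases are finite before running anything else.
# (LeakyReLU layers have no weights, so get_weights() is empty there.)
for layer in self.actor.layers:
    for w in layer.get_weights():
        assert np.all(np.isfinite(w)), 'non-finite values in ' + layer.name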
I found that if I remove layers 2 and 3 and instead use input (size 5) -> Dense layer (size 50) -> output layer (size 21), the dense layer produces nothing but NaNs. I'm new to Keras and RL, but I believe I am printing the output of the first LeakyReLU layer correctly:
print(self.sess.run(self.actor.layers[1].output, feed_dict={self.actor.input: state}))
where state is
[5.46094364e-03 0.00000000e+00 0.00000000e+00 7.48978632e+00 2.00000000e+03]
which produced:
[[nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan]]
However, if I add any additional dense layer, I end up with numbers!
[[221.13025 -32.324127 239.81203 29.123564 14.947907 -13.259543 -75.496254 52.49062 -97.64437 454.7144 14.617195 -36.815083 -54.008614 52.527546 -3.597932 -39.01585 -66.13489 -29.961552 -48.884182 28.370579 164.77858 158.79477 103.510185 12.60002 146.74603 103.54377 61.6077 -29.203367 450.80902 236.63414 -45.164257 -29.60965 66.94693 -49.735916 -17.434187 -41.16651 27.275759 104.64111 47.06206 315.57855 -70.642166 -25.824915 69.06004 -20.760962 -3.660415 439.28897 -44.686737 -15.48035 278.18478 -66.91201 ]]
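In case my probing method is itself the problem: an equivalent way to read the same activation would be through a Keras backend function, which runs in the session Keras manages itself rather than my own self.sess (a sketch, assuming state is reshaped into a batch of one):

from keras import backend as K
import numpy as np

# Alternative probe: evaluate the first LeakyReLU's output via a backend
# function instead of calling sess.run on the symbolic tensor directly.
probe = K.function([self.actor.input], [self.actor.layers[1].output])
batch = np.asarray(state, dtype=np.float32).reshape(1, -1)  # shape (1, 5)
print(probe([batch])[0])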
My understanding of a dense layer is that its output is computed as

output = matmul(input, weights) + bias
Since the input, weights, and biases contain no NaNs and no division is performed, I don't understand how NaNs can be generated, or why adding a subsequent fully connected layer affects the output of a preceding layer. Can someone explain this to me? Thanks!
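For reference, a manual recomputation of the first Dense layer along those lines, using the weights Keras reports and the same single-row state, would look like this (a sketch):

import numpy as np

# Recompute the first Dense layer by hand: z = matmul(input, W) + b,
# then apply LeakyReLU(alpha=0.2), and check that every entry is finite.
W, b = self.actor.layers[0].get_weights()   # shapes (5, 50) and (50,)
x = np.asarray(state, dtype=np.float64).reshape(1, -1)
z = x.dot(W) + b                            # pre-activation
out = np.where(z > 0, z, 0.2 * z)           # LeakyReLU(alpha=0.2)
print(np.isfinite(z).all(), np.isfinite(out).all())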