Difference between creating an ANN with matrix multiplication and with tf.layers.dense() in TensorFlow

Date: 2018-10-17 16:41:42

Tags: tensorflow neural-network deep-learning

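I tried training an ANN model in two ways: with matrix multiplication and with tf.layers.dense(). The results differ: the model built with matrix multiplication fails to optimize the loss function (the loss increases). What is the difference between the two approaches?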

ANN model using matrix multiplication

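import tensorflow as tf  # TensorFlow 1.x API

# Input batch with 4 features per sample (assumed; x is not defined in the original snippet)
x = tf.placeholder(tf.float32, [None, 4])

# Hidden layer 1: 4 inputs -> 64 units, ReLU
W1 = tf.Variable(tf.zeros([4,64]))
b1 = tf.Variable(tf.zeros([64]))
y1 = tf.nn.relu(tf.matmul(x, W1) + b1)

# Hidden layer 2: 64 -> 64 units, ReLU
W2 = tf.Variable(tf.zeros([64,64]))
b2 = tf.Variable(tf.zeros([64]))
y2 = tf.nn.relu(tf.matmul(y1, W2) + b2)

# Hidden layer 3: 64 -> 64 units, ReLU
W3 = tf.Variable(tf.zeros([64,64]))
b3 = tf.Variable(tf.zeros([64]))
y3 = tf.nn.relu(tf.matmul(y2, W3) + b3)

# Output layer: 64 -> 3 classes, softmax
W4 = tf.Variable(tf.zeros([64,3]))
b4 = tf.Variable(tf.zeros([3]))
y_out = tf.nn.softmax(tf.matmul(y3, W4) + b4)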

1 answer:

Answer 0 (score: 2)

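You are initializing the weights with zeros. With all-zero weights every hidden layer always outputs zero, so the gradients of the weights are always zero as well, which effectively prevents the network from learning anything.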
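The effect can be checked by hand. The following is a minimal NumPy sketch, not the asker's TensorFlow code: it rebuilds the 4-64-64-64-3 network from the question with all-zero parameters and runs one forward and backward pass. The random batch of 8 samples, the labels, and the helper name forward_backward are assumptions made for illustration, and the loss is assumed to be a mean softmax cross-entropy, which the question does not show.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponent
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward_backward(params, x, y_onehot):
    """Return (probs, grads) for a ReLU MLP with mean softmax cross-entropy loss."""
    # Forward pass, keeping the input of every layer for the backward pass.
    acts = [x]
    for W, b in params[:-1]:
        acts.append(relu(acts[-1] @ W + b))
    W_out, b_out = params[-1]
    probs = softmax(acts[-1] @ W_out + b_out)

    # Backward pass: d(loss)/d(logits) is (probs - y) / batch_size.
    delta = (probs - y_onehot) / x.shape[0]
    grads = []
    for i in range(len(params) - 1, -1, -1):
        W, _ = params[i]
        grads.append((acts[i].T @ delta, delta.sum(axis=0)))
        if i > 0:
            # Propagate through W, then through the ReLU that produced acts[i].
            delta = (delta @ W.T) * (acts[i] > 0)
    return probs, grads[::-1]

# Same layer sizes as in the question, everything initialised to zero.
sizes = [4, 64, 64, 64, 3]
params = [(np.zeros((m, n)), np.zeros(n)) for m, n in zip(sizes[:-1], sizes[1:])]

# Made-up data: 8 samples with 4 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
y = np.eye(3)[rng.integers(0, 3, size=8)]

probs, grads = forward_backward(params, x, y)

weight_grads_zero = all(not gW.any() for gW, _ in grads)
hidden_bias_grads_zero = all(not gb.any() for _, gb in grads[:-1])
output_bias_grad_nonzero = bool(grads[-1][1].any())

print("uniform predictions:", np.allclose(probs, 1.0 / 3.0))
print("all weight gradients zero:", weight_grads_zero)
print("hidden bias gradients zero:", hidden_bias_grads_zero)
print("output bias gradient nonzero:", output_bias_grad_nonzero)
```

In this sketch only the output bias receives a nonzero gradient, so the model can at best learn the class frequencies; the inputs never influence the prediction.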

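Initialize the weights with random values instead, for example drawn from a uniform or Gaussian distribution over a small range (less than 0.1).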
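A small NumPy sketch of this kind of initialization follows. The helper name init_layer, the limit of 0.1, and the random input batch are assumptions for illustration. In the question's TensorFlow 1.x code the corresponding change would presumably be something like tf.Variable(tf.random_uniform([4, 64], -0.1, 0.1)) or tf.Variable(tf.truncated_normal([4, 64], stddev=0.1)) in place of tf.zeros. This is probably also why the tf.layers.dense() version trains: as far as I know, its kernel defaults to a Glorot-uniform initializer rather than zeros.

```python
import numpy as np

def init_layer(rng, n_in, n_out, limit=0.1):
    """Weights drawn uniformly from [-limit, limit]; biases may stay at zero."""
    W = rng.uniform(-limit, limit, size=(n_in, n_out))
    b = np.zeros(n_out)
    return W, b

rng = np.random.default_rng(42)

# Same layer sizes as in the question: 4 -> 64 -> 64 -> 64 -> 3.
sizes = [4, 64, 64, 64, 3]
params = [init_layer(rng, m, n) for m, n in zip(sizes[:-1], sizes[1:])]

# Made-up input batch: 8 samples with 4 features.
x = rng.normal(size=(8, 4))

# Forward pass through the ReLU hidden layers.
h = x
for W, b in params[:-1]:
    h = np.maximum(h @ W + b, 0.0)
logits = h @ params[-1][0] + params[-1][1]

# With random weights the hidden units no longer all output zero,
# and the logits depend on the input sample.
active_fraction = (h > 0).mean()
print("fraction of active units in the last hidden layer:", active_fraction)
print("logits differ between samples:", bool(np.ptp(logits, axis=0).max() > 0))
```

Because the weights differ from unit to unit, the hidden units produce different activations and receive different gradients, so training can make progress.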