I'm playing around with Google Colab and TensorFlow. I'm trying to evaluate a simple hand-built perceptron, without using Keras, in eager execution mode.
The perceptron expects a (1x2) input tensor and has two layers, with the following weights and biases: W1: (2x2) / B1: (1x2) and W2: (2x1) / B2: (1x1).
I found that this simple piece of code fails for no apparent reason. It seems related to the optimizer: every optimizer I've tried fails with a different error. For example, with the one used below (GradientDescentOptimizer), TensorFlow says the operation is not implemented, and I don't know why. Here is a self-contained snippet (TensorFlow 1.13.1 / Python 3):
import numpy as np
import tensorflow as tf
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()

with tf.device("GPU:0"):
    W1 = tf.random_uniform([2, 2], -1, 1, tf.float32)
    B1 = tf.random_uniform([1, 2], -1, 1, tf.float32)
    W2 = tf.random_uniform([2, 1], -1, 1, tf.float32)
    B2 = tf.random_uniform([1, 1], -1, 1, tf.float32)

    X0 = tf.convert_to_tensor(np.array([[0, 0]]), tf.float32)

    with tf.GradientTape() as tape:
        tape.watch(W1)
        tape.watch(B1)
        tape.watch(W2)
        tape.watch(B2)
        X1 = tf.sigmoid(tf.matmul(X0, W1) + B1)
        X2 = tf.sigmoid(tf.matmul(X1, W2) + B2)
        Loss = tf.square(X2 - tf.constant([[1]], tf.float32))

    dLoss_dParams = tape.gradient(Loss, [W1, B1, W2, B2])

    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
    optimizer.apply_gradients(zip(dLoss_dParams, [W1, B1, W2, B2]), tf.Variable(0))
What am I doing wrong?
Thanks in advance!
Answer 0 (score: 1):
OK, in case anyone else runs into the same problem: following @jdehesa's answer in the comments, the resulting code looks like this (the key change is that the parameters are now created as tf.Variable objects, which is what apply_gradients requires; I've also updated the original code so the perceptron now tries to solve the XOR problem):
import numpy as np
import tensorflow as tf
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()

optimizer = tf.train.AdamOptimizer()

with tf.device("GPU:0"):
    X0 = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], np.float32)          # 4x2
    W1 = tf.Variable(tf.random_uniform([2, 2], -1.0, 1.0, tf.float32))   # 4x2 * 2x2 => 4x2
    B1 = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0, tf.float32))   # 4x2 + 1x2 => 4x2
    W2 = tf.Variable(tf.random_uniform([2, 1], -1.0, 1.0, tf.float32))   # 4x2 * 2x1 => 4x1
    B2 = tf.Variable(tf.random_uniform([1, 1], -1.0, 1.0, tf.float32))   # 4x1 + 1x1 => 4x1

    # Single warm-up step
    with tf.GradientTape() as tape:
        # tape.watch(W1)
        # tape.watch(B1)
        # tape.watch(W2)
        # tape.watch(B2)
        X1 = tf.tanh(tf.matmul(X0, W1) + B1)
        X2 = tf.tanh(tf.matmul(X1, W2) + B2)
        Loss = tf.square(X2 - tf.constant([[0], [1], [1], [0]], tf.float32))
    dLoss_dParams = tape.gradient(Loss, [W1, B1, W2, B2])
    optimizer.apply_gradients(zip(dLoss_dParams, [W1, B1, W2, B2]))
    print(Loss.numpy()[0][0])

    # Training loop
    for i in range(10000):
        with tf.GradientTape() as tape:
            X1 = tf.tanh(tf.matmul(X0, W1) + B1)
            X2 = tf.tanh(tf.matmul(X1, W2) + B2)
            Loss = tf.reduce_mean(tf.square(X2 - tf.constant([[0], [1], [1], [0]], tf.float32)))
        dLoss_dParams = tape.gradient(Loss, [W1, B1, W2, B2])
        optimizer.apply_gradients(zip(dLoss_dParams, [W1, B1, W2, B2]))
        if i % 1000 == 0:
            print(Loss.numpy())

    # Final forward pass: outputs for the four XOR inputs
    X1 = tf.tanh(tf.matmul(X0, W1) + B1)
    X2 = tf.tanh(tf.matmul(X1, W2) + B2)
    print(X2.numpy()[0][0])
    print(X2.numpy()[1][0])
    print(X2.numpy()[2][0])
    print(X2.numpy()[3][0])
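For anyone wondering why the original snippet failed: tf.random_uniform returns plain, immutable Tensor objects, while apply_gradients can only update tf.Variable objects, which is presumably why TensorFlow reported the operation as not implemented. A minimal sketch of the distinction (assuming TensorFlow 1.13 with eager execution enabled, as in the question; the toy loss is purely illustrative):

import tensorflow as tf

tf.enable_eager_execution()

# A Variable holds mutable state the optimizer can update in place.
w = tf.Variable(tf.random_uniform([2, 2], -1.0, 1.0, tf.float32))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.square(w))  # toy loss, just to produce a gradient
grads = tape.gradient(loss, [w])

# Works: the update target is a tf.Variable.
optimizer.apply_gradients(zip(grads, [w]))

# Would fail: a plain Tensor has no state for the optimizer to update.
# optimizer.apply_gradients(zip(grads, [tf.random_uniform([2, 2])]))

Note also that tape.watch is only needed for plain tensors; trainable tf.Variable objects are tracked by the tape automatically, which is why the watch calls could be commented out in the code above.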