TensorFlow fails on simple, idiomatic code

Date: 2019-04-23 02:33:42

Tags: python python-3.x tensorflow

I'm playing around with TensorFlow on Google Colab. Rather than using Keras, I'm trying to evaluate a simple hand-built perceptron in eager execution mode.

The perceptron expects a (1x2) input tensor and has two layers with the following weights and biases: W1: (2x2) / B1: (1x2) and W2: (2x1) / B2: (1x1), so the forward pass is X1 = sigmoid(X0·W1 + B1) followed by X2 = sigmoid(X1·W2 + B2).

I found that this simple piece of code fails for no apparent reason. It seems to be related to the optimizer: every optimizer I have tried fails with a different error. For example, with the optimizer used below (GradientDescentOptimizer), TensorFlow says the operation is not implemented, and I don't understand why. Here is a self-contained piece of code (TensorFlow 1.13.1 / Python 3):

import numpy as np
import tensorflow as tf
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()
with tf.device("GPU:0"):
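  # Randomly initialized weights and biases for the two layers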
  W1 = tf.random_uniform([2, 2], -1, 1, tf.float32)
  B1 = tf.random_uniform([1, 2], -1, 1, tf.float32)

  W2 = tf.random_uniform([2, 1], -1, 1, tf.float32)
  B2 = tf.random_uniform([1, 1], -1, 1, tf.float32)

  X0 = tf.convert_to_tensor(np.array([[0, 0]]), tf.float32)

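  # Record the forward pass and loss on a tape so gradients can be computed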
  with tf.GradientTape() as tape:
    tape.watch(W1)
    tape.watch(B1)
    tape.watch(W2)
    tape.watch(B2)

    X1 = tf.sigmoid(tf.matmul(X0, W1) + B1)
    X2 = tf.sigmoid(tf.matmul(X1, W2) + B2)

    Loss = tf.square(X2 - tf.constant([[1]], tf.float32))

  dLoss_dParams = tape.gradient(Loss, [W1, B1, W2, B2])  

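# Build a gradient-descent optimizer and apply one update step to the parameters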
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
optimizer.apply_gradients(zip(dLoss_dParams, [W1, B1, W2, B2]), tf.Variable(0))

[screenshot of the error message]

What am I doing wrong?

Thanks in advance!

1 answer:

Answer 0 (score: 1)

OK, just in case anyone else runs into the same problem: following @jdehesa's answer in the comments, the fix is to wrap the parameters in tf.Variable so the optimizer can actually update them. The resulting code looks like this (I have also updated the original code so that the perceptron now tries to solve the XOR problem):

import numpy as np
import tensorflow as tf
import tensorflow.contrib.eager as tfe

tf.enable_eager_execution()

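# Create the optimizer once, outside the training loop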
optimizer = tf.train.AdamOptimizer()

with tf.device("GPU:0"):
  X0 = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], np.float32) # 4x2

  W1 = tf.Variable(tf.random_uniform([2, 2], -1.0, 1.0, tf.float32)) # 4x2 * 2x2 => 4x2
  B1 = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0, tf.float32)) # 4x2 + 1x2 => 4x2

  W2 = tf.Variable(tf.random_uniform([2, 1], -1.0, 1.0, tf.float32)) # 4x2 * 2x1 => 4x1
  B2 = tf.Variable(tf.random_uniform([1, 1], -1.0, 1.0, tf.float32)) # 4x1 + 1x1 => 4x1

  with tf.GradientTape() as tape:
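    # tf.Variable objects are watched by the tape automatically, so the explicit tape.watch calls are no longer needed: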
    #     tape.watch(W1)
    #     tape.watch(B1)
    #     tape.watch(W2)
    #     tape.watch(B2)

    X1 = tf.tanh(tf.matmul(X0, W1) + B1)
    X2 = tf.tanh(tf.matmul(X1, W2) + B2)

    Loss = tf.square(X2 - tf.constant([[0], [1], [1], [0]], tf.float32))

  dLoss_dParams = tape.gradient(Loss, [W1, B1, W2, B2])
  optimizer.apply_gradients(zip(dLoss_dParams, [W1, B1, W2, B2]))
  print(Loss.numpy()[0][0]) 

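# Training loop: record the forward pass on a fresh tape each iteration and apply the gradients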
for i in range(10000):
  with tf.GradientTape() as tape:
    X1 = tf.tanh(tf.matmul(X0, W1) + B1)
    X2 = tf.tanh(tf.matmul(X1, W2) + B2)

    Loss = tf.reduce_mean(tf.square(X2 - tf.constant([[0], [1], [1], [0]], tf.float32)))

  dLoss_dParams = tape.gradient(Loss, [W1, B1, W2, B2])  
  optimizer.apply_gradients(zip(dLoss_dParams, [W1, B1, W2, B2]))

  if i % 1000 == 0:
    print(Loss.numpy())

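# Final forward pass with the trained parameters; the outputs should approximate XOR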
X1 = tf.tanh(tf.matmul(X0, W1) + B1)
X2 = tf.tanh(tf.matmul(X1, W2) + B2)

print(X2.numpy()[0][0])
print(X2.numpy()[1][0])
print(X2.numpy()[2][0])
print(X2.numpy()[3][0])
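
As a rough sanity check, one could also threshold the tanh outputs at, say, 0.5 and compare them against the XOR truth table (this continues the script above and reuses its numpy import and the X2 tensor):

predictions = (X2.numpy() > 0.5).astype(np.int32)   # map tanh outputs to 0/1
targets = np.array([[0], [1], [1], [0]], np.int32)  # XOR truth table
print(predictions.flatten())                         # expect [0 1 1 0] after training
print(np.array_equal(predictions, targets))          # True if the network solved XOR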