Question

我正在试验Jupyter上的一些代码并继续陷入困境。如果我删除以“optimizer = ...”开头的行以及对此行的所有引用，那么事情确实很好。但是，如果我将这一行放在代码中，则会出错。

我没有在这里粘贴所有其他函数以将代码的大小保持在可读级别。我希望有经验的人可以立刻看到它的问题。

请注意，输入图层，2个隐藏图层和输出图层中有5个，4个，3个和2个单位。

CODE：

tf.reset_default_graph()

num_units_in_layers = [5,4,3,2]

X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)
init = tf.global_variables_initializer() 

my_sess = tf.Session()
my_sess.run(init)
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters, my_sess)
#my_sess.run(parameters)  # Do I need to run this? Or is it obsolete?

cost = compute_cost(ZL, Y, my_sess, parameters, batch_size=3, lambd=0.05)
optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
_ , minibatch_cost = my_sess.run([optimizer, cost], 
                                 feed_dict={X: minibatch_X, 
                                            Y: minibatch_Y})

print(minibatch_cost)
my_sess.close()

ERROR：

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-321-135b9fc18268> in <module>()
     16 cost = compute_cost(ZL, Y, my_sess, parameters, 3, 0.05)
     17 
---> 18 optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
     19 _ , minibatch_cost = my_sess.run([optimizer, cost], 
     20                                  feed_dict={X: minibatch_X, 

~/.local/lib/python3.5/site-packages/tensorflow/python/training/optimizer.py in minimize(self, loss, global_step, var_list, gate_gradients, aggregation_method, colocate_gradients_with_ops, name, grad_loss)
    362           "No gradients provided for any variable, check your graph for ops"
    363           " that do not support gradients, between variables %s and loss %s." %
--> 364           ([str(v) for _, v in grads_and_vars], loss))
    365 
    366     return self.apply_gradients(grads_and_vars, global_step=global_step,

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'weights/W1:0' shape=(4, 5) dtype=float32_ref>", "<tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>", "<tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>", "<tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>", "<tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>", "<tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>"] and loss Tensor("Add_3:0", shape=(), dtype=float32).

请注意，如果我运行

print(tf.trainable_variables())

就在“optimizer = ...”行之前，我实际上在那里看到了我的可训练变量。

hts/W1:0' shape=(4, 5) dtype=float32_ref>, <tf.Variable 'biases/b1:0' shape=(4, 1) dtype=float32_ref>, <tf.Variable 'weights/W2:0' shape=(3, 4) dtype=float32_ref>, <tf.Variable 'biases/b2:0' shape=(3, 1) dtype=float32_ref>, <tf.Variable 'weights/W3:0' shape=(2, 3) dtype=float32_ref>, <tf.Variable 'biases/b3:0' shape=(2, 1) dtype=float32_ref>]

有人会对可能出现的问题有所了解吗？

编辑和添加更多信息：万一你想看看我如何创建＆amp;初始化我的参数，这是代码。也许这部分有问题，但我看不出是什么......

def get_nn_parameter(variable_scope, variable_name, dim1, dim2):
  with tf.variable_scope(variable_scope, reuse=tf.AUTO_REUSE):
    v = tf.get_variable(variable_name, 
                        [dim1, dim2], 
                        trainable=True, 
                        initializer = tf.contrib.layers.xavier_initializer())
  return v


def initialize_layer_parameters(num_units_in_layers):
    parameters = {}
    L = len(num_units_in_layers)

    for i in range (1, L):
        temp_weight = get_nn_parameter("weights",
                                       "W"+str(i), 
                                       num_units_in_layers[i], 
                                       num_units_in_layers[i-1])
        parameters.update({"W" + str(i) : temp_weight})  
        temp_bias = get_nn_parameter("biases",
                                     "b"+str(i), 
                                     num_units_in_layers[i], 
                                     1)
        parameters.update({"b" + str(i) : temp_bias})  

    return parameters

＃

附录

我得到了它的工作。我没有写一个单独的答案，而是在这里添加正确版本的代码。

（David的答案在下面有很多帮助。）

我只是将my_sess作为参数删除到了我的compute_cost函数。（我以前无法使它工作，但看起来根本不需要它。）而且我还在主函数中重新排序语句，以正确的顺序调用事物。

以下是我的费用函数的工作版本以及我如何称呼它：

def compute_cost(ZL, Y, parameters, mb_size, lambd):

    logits = tf.transpose(ZL)
    labels = tf.transpose(Y)

    cost_unregularized = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits = logits, labels = labels))

    #Since the dict parameters includes both W and b, it needs to be divided with 2 to find L
    L = len(parameters) // 2

    list_sum_weights = []

    for i in range (0, L):
        list_sum_weights.append(tf.nn.l2_loss(parameters.get("W"+str(i+1))))

    regularization_effect = tf.multiply((lambd / mb_size), tf.add_n(list_sum_weights))
    cost = tf.add(cost_unregularized, regularization_effect)

    return cost

这是我调用compute_cost（..）函数的主要函数：

tf.reset_default_graph()

num_units_in_layers = [5,4,3,2]

X = tf.placeholder(shape=[5, 3], dtype=tf.float32)
Y = tf.placeholder(shape=[2, 3], dtype=tf.float32)
parameters = initialize_layer_parameters(num_units_in_layers)

my_sess = tf.Session()
ZL = forward_propagation_with_relu(X, num_units_in_layers, parameters)

cost = compute_cost(ZL, Y, parameters, 3, 0.05)
optimizer =  tf.train.AdamOptimizer(learning_rate = 0.001).minimize(cost)
init = tf.global_variables_initializer() 

my_sess.run(init)
_ , minibatch_cost = my_sess.run([optimizer, cost], 
                                 feed_dict={X: [[-1.,4.,-7.],[2.,6.,2.],[3.,3.,9.],[8.,4.,4.],[5.,3.,5.]], 
                                            Y: [[0.6, 0., 0.3], [0.4, 0., 0.7]]})


print(minibatch_cost)

my_sess.close()

Answer 1

我99.9％确定您错误地创建了费用功能。

<= 3.99

您的成本函数应该是张量。您正在将会话传递给成本函数，看起来它实际上正在尝试运行张量流量会话，这非常错误。

之后您将cost = compute_cost(ZL, Y, my_sess, parameters, batch_size=3, lambd=0.05)的结果传递给最小化器。

这是对张量流的常见误解。

Tensorflow是一种声明性编程范例，这意味着您首先声明要运行的所有操作，然后再传入数据并运行它。

重构您的代码以严格遵循此最佳做法：

（1）创建一个compute_cost函数，在此函数中应放置所有数学运算。您应该定义成本函数和网络的所有层。返回build_graph()培训操作（以及您可能想要获得的任何其他OP，例如准确性）。

（2）现在创建一个会话。

（3）在此之后，如果您觉得自己需要做错事，请不要再创建张量流操作或变量。

（4）在train_op上调用sess.run，并通过optimize.minimize()传递占位符数据。

以下是如何构建代码的简单示例：

https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/neural_network_raw.ipynb

总的来说，有一些非常好的例子是由aymericdamien提出来的，我强烈建议您查看它们以了解张量流的基本知识。

Tensorflow - 没有为任何变量

1 个答案: