TensorFlow: "No gradients provided for any variable" and partial_run

Time: 2017-02-28 02:04:12

Tags: python tensorflow

Question

Using TensorFlow's partial_run() method does not work the way I expected. I use it towards the bottom of the supplied code, and I believe it is what gives me the attached error.

The general flow of data is: I need to get a prediction from the model, use that prediction in some non-TensorFlow code (a programmatic software synthesizer), and then obtain audio features (MFCCs, RMS, FFT) after playing a MIDI note. These features can finally be passed to the cost function to check how close the predicted timbre comes to recreating the desired sound supplied by the current example.

Code - preprocessing omitted

# Create the tensorflow graph.
dimension_data_example = generate_examples(1,
                                           midi_note,
                                           midi_velocity,
                                           note_length,
                                           render_length,
                                           engine,
                                           generator,
                                           mfcc_normaliser,
                                           rms_normaliser)

features, parameters = dimension_data_example[0]
# https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/recurrent_network.ipynb
# Parameters for the tensorflow graph.
learning_rate = 0.001
training_iters = 256
batch_size = 128
display_step = 10
number_hidden_1 = 128
number_hidden_2 = 128

# Network parameters:
# 14 * 181 - (amount of mfccs + rms value) * sample size
number_input = int(features.shape[0])

# 155 - amount of parameters
number_outputs = len(parameters)

x = tf.placeholder("float", [None, number_input])

# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with RELU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Hidden layer with RELU activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([number_input, number_hidden_1])),
    'h2': tf.Variable(tf.random_normal([number_hidden_1, number_hidden_2])),
    'out': tf.Variable(tf.random_normal([number_hidden_2, number_outputs]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([number_hidden_1])),
    'b2': tf.Variable(tf.random_normal([number_hidden_2])),
    'out': tf.Variable(tf.random_normal([number_outputs]))
}

# Construct model
prediction = multilayer_perceptron(x, weights, biases)

x_original = tf.placeholder("float", [None, number_input])
x_from_y = tf.placeholder("float", [None, number_input])
cost = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(x_original, x_from_y))))
optimiser = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launching the graph
with tf.Session() as sess:

    sess.run(init)
    step = 1

    while step * batch_size < training_iters:

        train_batch = generate_examples(batch_size,
                                        midi_note,
                                        midi_velocity,
                                        note_length,
                                        render_length,
                                        engine,
                                        generator,
                                        mfcc_normaliser,
                                        rms_normaliser)
        split_train = map(list, zip(*train_batch))
        batch_x = split_train[0]

        setup = sess.partial_run_setup([prediction, optimiser],
                                       [x, x_original, x_from_y])

        pred = sess.partial_run(setup, prediction, feed_dict={x: batch_x})

        features_from_prediction = get_features(pred,
                                                midi_note,
                                                midi_velocity,
                                                note_length,
                                                render_length)

        sess.partial_run(setup, optimiser, feed_dict={x_original: batch_x,
                                                      x_from_y: features_from_prediction})

Error

Traceback (most recent call last):
  File "model.py", line 255, in <module>
    optimiser = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 276, in minimize
    ([str(v) for _, v in grads_and_vars], loss))
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ['Tensor("Variable/read:0", shape=(2534, 128), dtype=float32)', 'Tensor("Variable_1/read:0", shape=(128, 128), dtype=float32)', 'Tensor("Variable_2/read:0", shape=(128, 155), dtype=float32)', 'Tensor("Variable_3/read:0", shape=(128,), dtype=float32)', 'Tensor("Variable_4/read:0", shape=(128,), dtype=float32)', 'Tensor("Variable_5/read:0", shape=(155,), dtype=float32)'] and loss Tensor("Sqrt:0", shape=(), dtype=float32).

1 Answer:

Answer 0: (score: 9)

The immediate error you are getting:

No gradients provided for any variable, check your graph for ops that do not support gradients, between variables

is because there is no gradient path from your cost to your weights. Between the weights and the cost there are placeholders and computation that happens outside the graph, so no gradient path from the cost back to the weights exists.
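Here is a minimal sketch (a hypothetical toy graph, TF 1.x API as in the question) that reproduces the same failure: the loss below is built only from placeholders, so minimize() cannot find a path from the loss back to any variable.

import tensorflow as tf

w = tf.Variable(tf.random_normal([3, 1]))
x = tf.placeholder("float", [None, 3])
pred = tf.matmul(x, w)                   # depends on the variable w

a = tf.placeholder("float", [None, 1])   # stands in for x_original
b = tf.placeholder("float", [None, 1])   # stands in for x_from_y
loss = tf.reduce_mean(tf.square(a - b))  # depends only on placeholders

# Raises: ValueError: No gradients provided for any variable ...
train = tf.train.AdamOptimizer(0.001).minimize(loss)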

In other words, consider the setup:

Weights -> prediction -> get_features -> calculate cost.

Now, thinking about backpropagation: we can compute the gradient of the cost, but we have no gradient from the cost to get_features, or from get_features to the prediction, because get_features is not part of the graph:

Weights <- prediction <-/- get_features <-/- calculate cost.

So the weights can never learn. If you want this setup to work, you will somehow need a path from the cost back to the prediction, perhaps by simulating the gradient of get_features on the backward pass through the graph (see the sketch below). There might be a cleaner way, but I can't think of one right now.
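One rough way to approximate that idea, under some loud assumptions: estimate d(cost)/d(prediction) outside the graph (for example by finite differences through the synthesizer; estimate_cost_gradient below is a hypothetical helper, not a real API), then chain that estimate through the in-graph half of the model with tf.gradients(..., grad_ys=...) and apply the result with apply_gradients. The name cost_grad is made up; shapes follow the question's graph.

# Placeholder for the externally estimated gradient d(cost)/d(prediction).
cost_grad = tf.placeholder("float", [None, number_outputs])

var_list = list(weights.values()) + list(biases.values())

# Chain rule through the in-graph part only:
# d(cost)/d(w) = d(cost)/d(prediction) * d(prediction)/d(w)
grads = tf.gradients(prediction, var_list, grad_ys=cost_grad)

optimiser = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_step = optimiser.apply_gradients(list(zip(grads, var_list)))

# In the training loop: forward pass, synthesize, estimate the
# gradient numerically, then apply it.
pred = sess.run(prediction, feed_dict={x: batch_x})
grad_estimate = estimate_cost_gradient(pred, batch_x)  # hypothetical
sess.run(train_step, feed_dict={x: batch_x, cost_grad: grad_estimate})

Note that this runs the forward pass twice (once for pred, once inside train_step), which is the inefficiency partial_run was meant to avoid; but partial_run can only help once the gradient path itself exists.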

Hope that helps!