How can I track gradients through model.predict()?

Asked: 2019-07-03 02:12:38

Tags: python tensorflow machine-learning keras

I am trying to create a custom loss function that depends on the model's output logits as an intermediate step in an optimizer minimization task. However, I keep getting an error saying that some operation between the input perturbation I am trying to optimize and the loss function does not support gradients:

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'perturbation:0' shape=(1, 28, 28, 1) dtype=float32, numpy=array([[[[0.], [0.], [0.], ...]]])>" ...

I strongly suspect this is caused by the call to model.predict(), since it returns a NumPy array rather than a tensor that gradients can be tracked through. If that is indeed the culprit, how can I track gradients through this function so that the loss can be minimized properly?
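
To illustrate the distinction I mean (a minimal sketch, assuming TF 2.x-style eager execution; x_batch is just a placeholder input batch):

preds_np = model.predict(x_batch)  # numpy.ndarray -- the gradient chain is broken
preds_t = model(x_batch)           # tf.Tensor     -- gradients can flow through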

I tried wrapping output in a tf.Variable (output = tf.Variable(model.predict(newimg, steps=num_steps), dtype=tf.float32, trainable=False, name="output")) to see if that would help, but it did not.
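
Here is a minimal reproduction of why I believe the re-wrapping cannot help (my own sketch, assuming eager execution): once a value leaves the graph as NumPy, wrapping it back in a tf.Variable does not reconnect it, and the gradient comes back as None.

import numpy as np
import tensorflow as tf

x = tf.Variable(np.zeros((1, 4), dtype=np.float32))
with tf.GradientTape() as tape:
    y_np = (x * 2.0).numpy()       # leaves the graph, just like model.predict()
    y = tf.Variable(y_np)          # re-wrapping does not restore the link to x
    loss = tf.reduce_sum(y)
print(tape.gradient(loss, x))      # -> None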

Model

# imports (assumed: the k alias refers to tf.keras)
import numpy as np
import tensorflow as tf
from tensorflow import keras as k

train_temperature = 1

# create model layers
model = k.models.Sequential([
  # flatten into a single vector
  k.layers.Flatten(input_shape=(28, 28, 1), name='input'),
  # first layer
  k.layers.Dense(512, activation=tf.nn.relu, name='dense_1'),
  k.layers.Dropout(0.2, name='dropout_1'),
  # second layer
  k.layers.Dense(256, activation=tf.nn.relu, name='dense_2'),
  k.layers.Dropout(0.2, name='dropout_2'),
  # third layer
  k.layers.Dense(128, activation=tf.nn.relu, name='dense_3'),
  k.layers.Dropout(0.2, name='dropout_3'),
  # fourth layer
  k.layers.Dense(64, activation=tf.nn.relu, name='dense_4'),
  k.layers.Dropout(0.2, name='dropout_4'),
  # fifth layer
  k.layers.Dense(20, activation=tf.nn.relu, name='dense_5'),
  k.layers.Dropout(0.2, name='dropout_5'),
  # sixth layer
  k.layers.Dense(10, name='dense_6')
])

# custom loss: softmax cross-entropy on temperature-scaled logits
def fn(correct, predicted):
    return tf.nn.softmax_cross_entropy_with_logits_v2(labels=correct, logits=predicted/train_temperature)

# compile with optimizer, loss, and metrics
model.compile(optimizer='adam',
              loss=fn,
              metrics=['accuracy'])

# fit
model.fit(x_train, y_train, epochs=2, shuffle=True)

# evaluate on the test set
model.evaluate(x_test, y_test)

Problem code

# the variable to optimize
perturbation = tf.Variable(np.zeros(input_shape, dtype=np.float32), trainable=True, name="perturbation")

# other variables
img = tf.Variable(np.zeros(input_shape), dtype=tf.float32, trainable=False, name="img")
label = tf.Variable(np.zeros((batch_size, num_labels)), dtype=tf.float32, trainable=False, name="label")

# the resulting adversarial image, tanh'd to keep bounded from boxmin to boxmax
boxmul = (boxmax - boxmin) / 2.
boxplus = (boxmin + boxmax) / 2.
newimg = tf.tanh(img + perturbation) * boxmul + boxplus
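# Note: tanh(.) lies in (-1, 1), so the line above maps the unconstrained value
# w = img + perturbation to tanh(w) * (boxmax - boxmin)/2 + (boxmin + boxmax)/2,
# which always falls strictly inside (boxmin, boxmax).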

# the model's pre-softmax output (logits); note that predict() returns NumPy
output = model.predict(newimg, steps=num_steps)

# compare the logit of the true class with the largest logit among the others
real = tf.reduce_sum(label * output, 1)
other = tf.reduce_max((1 - label) * output - label * 10000, 1)

# compute loss
def loss():
  loss1 = tf.maximum(0.0, real - other)
  loss2 = 0 # TODO: add the KL-Divergence regularizer term here
  loss = loss1 + reg * loss2
  return loss

# set up the Adam optimizer and the minimization op over the perturbation
optimizer = tf.train.AdamOptimizer(step_size)
train = optimizer.minimize(loss, var_list=[perturbation])
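
What I am considering instead (only a sketch of my suspicion, not verified: calling the model directly rather than model.predict() should keep everything as tensors):

def loss():
    newimg = tf.tanh(img + perturbation) * boxmul + boxplus
    output = model(newimg)                  # tf.Tensor instead of a NumPy array
    real = tf.reduce_sum(label * output, 1)
    other = tf.reduce_max((1 - label) * output - label * 10000, 1)
    return tf.maximum(0.0, real - other)    # + reg * loss2 once that term exists

train = optimizer.minimize(loss, var_list=[perturbation])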

0 Answers:

No answers yet.