Suppose I have the following loss function:
loss_a = tf.reduce_mean(my_loss_fn(model_output, targets))
loss_b = tf.reduce_mean(my_other_loss_fn(model_output, targets))
loss_final = loss_a + tf.multiply(alpha, loss_b)
To visualize the norm of the gradients w.r.t. loss_final, one could do this:
optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
grads_and_vars = optimizer.compute_gradients(loss_final)
grads, _ = list(zip(*grads_and_vars))
norms = tf.global_norm(grads)
gradnorm_s = tf.summary.scalar('gradient norm', norms)
train_op = optimizer.apply_gradients(grads_and_vars, name='train_op')
However, I would like to plot the norm of the gradients w.r.t. loss_a and loss_b separately. What is the most efficient way to do this? Do I have to call compute_gradients(..) on both loss_a and loss_b separately, and then add those two gradients together before passing them to optimizer.apply_gradients(..)? I know that this would be mathematically correct due to the sum rule, but it seems a bit cumbersome, and I also don't know how to implement the summation of the gradients correctly. Furthermore, loss_final is rather simple here, because it's just a sum. What if loss_final were more complicated, e.g. a division?
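For concreteness, this is the kind of manual summation I have in mind, sketched from memory and untested (it assumes both losses touch the same variables in the same order, and that no gradient is None):
optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
gv_a = optimizer.compute_gradients(loss_a)
gv_b = optimizer.compute_gradients(loss_b)
norm_a = tf.global_norm([g for g, _ in gv_a])
norm_b = tf.global_norm([g for g, _ in gv_b])
# combine gradients variable-by-variable: grad_final = grad_a + alpha * grad_b
combined = [(ga + tf.multiply(alpha, gb), va)
            for (ga, va), (gb, _) in zip(gv_a, gv_b)]
train_op = optimizer.apply_gradients(combined, name='train_op')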
I'm using Tensorflow 0.12.
Answer 0 (score: 11):
You are right that combining gradients can get messy. Instead, just compute the gradients of each of the losses as well as the final loss. Because tensorflow optimizes the directed acyclic graph (DAG) before compilation, this doesn't result in duplication of work.
For example:
import tensorflow as tf

with tf.name_scope('inputs'):
    W = tf.Variable(dtype=tf.float32, initial_value=tf.random_normal((4, 1), dtype=tf.float32), name='W')
    x = tf.random_uniform((6, 4), dtype=tf.float32, name='x')

with tf.name_scope('outputs'):
    y = tf.matmul(x, W, name='y')

def my_loss_fn(output, targets, name):
    return tf.reduce_mean(tf.abs(output - targets), name=name)

def my_other_loss_fn(output, targets, name):
    return tf.sqrt(tf.reduce_mean((output - targets) ** 2), name=name)

def get_tensors(loss_fn):
    loss = loss_fn(y, targets, 'loss')
    grads = tf.gradients(loss, W, name='gradients')
    norm = tf.norm(grads, name='norm')
    return loss, grads, norm

targets = tf.random_uniform((6, 1))
with tf.name_scope('a'):
    loss_a, grads_a, norm_a = get_tensors(my_loss_fn)
with tf.name_scope('b'):
    loss_b, grads_b, norm_b = get_tensors(my_other_loss_fn)

with tf.name_scope('combined'):
    loss = tf.add(loss_a, loss_b, name='loss')
    grad = tf.gradients(loss, W, name='gradients')

with tf.Session() as sess:
    tf.global_variables_initializer().run(session=sess)
    writer = tf.summary.FileWriter('./tensorboard_results', sess.graph)
    res = sess.run([norm_a, norm_b, grad])
    print(*res, sep='\n')
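If you also want the per-loss norms in TensorBoard, as in the question, you can attach scalar summaries to them. A minimal sketch (the summary tags are just illustrative names):
# log the per-loss gradient norms to TensorBoard
summ_a = tf.summary.scalar('norm_a', norm_a)
summ_b = tf.summary.scalar('norm_b', norm_b)
merged = tf.summary.merge([summ_a, summ_b])
# in the training loop, evaluate and write the merged summary:
# summary_str = sess.run(merged)
# writer.add_summary(summary_str, global_step=step)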
Edit: In response to your comment... You can check the DAG of a tensorflow model using tensorboard. I've updated the code to store the graph.
Run tensorboard --logdir $PWD/tensorboard_results
in a terminal and navigate to the URL printed on the command line (usually http://localhost:6006/
). Then click on the GRAPH tab to see the DAG. You can recursively expand the tensors, ops, and namespaces to view subgraphs and see individual operations and their inputs.