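Question: I am loading a simple VGG16 from a saved checkpoint. I want to generate saliency maps for an image during inference. When I compute the gradients this requires (loss w.r.t. the input image), all of the gradients come back as zero. Any ideas about what I am missing here would be much appreciated!
tf version: tensorflow-2.0alpha-gpu
Model: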
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16 as KerasVGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Flatten, Dense

class VGG16(Model):
    def __init__(self, num_classes, use_pretrained=True):
        super(VGG16, self).__init__()
        self.num_classes = num_classes
        self.use_pretrained = use_pretrained
        if use_pretrained:
            self.base_model = KerasVGG16(weights='imagenet', include_top=False)
            for layer in self.base_model.layers:
                layer.trainable = False
        else:
            self.base_model = KerasVGG16(include_top=False)
        self.flatten1 = Flatten(name='flatten')
        self.dense1 = Dense(4096, activation='relu', name='fc1')
        self.dense2 = Dense(100, activation='relu', name='fc2')
        self.dense3 = Dense(self.num_classes, activation='softmax', name='predictions')

    def call(self, inputs):
        x = self.base_model(tf.cast(inputs, tf.float32))
        x = self.flatten1(x)
        x = self.dense1(x)
        x = self.dense2(x)
        x = self.dense3(x)
        return x
I train this model, save it to a checkpoint, and load it back with:
model = VGG16(num_classes=2, use_pretrained=False)
checkpoint = tf.train.Checkpoint(net=model)
status = checkpoint.restore(tf.train.latest_checkpoint('./my_checkpoint'))
status.assert_consumed()
I have confirmed that the weights are loaded correctly.
Get a test image:
# load my image and make sure it's float
img = tf.convert_to_tensor(image, dtype=tf.float64)
support_class = tf.convert_to_tensor(support_class, dtype=tf.float64)
Get the gradients:
with tf.GradientTape(persistent=True) as g_tape:
    g_tape.watch(img)
    #g_tape.watch(model.base_model.trainable_variables)
    #g_tape.watch(model.trainable_variables)
    loss = tf.losses.CategoricalCrossentropy()(support_class, model(img))
gradients_wrt_image = g_tape.gradient(loss,
                                      img, unconnected_gradients=tf.UnconnectedGradients.NONE)
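When I inspect the gradients, they are all zero! Any idea what I am missing? Thanks in advance!
Answer 0: (score: 1)
The gradients are very small, but not zero: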
import numpy as np

# `model` is the VGG16 instance restored in the question.

def almost_equals(a, b, decimal=6):
    try:
        np.testing.assert_almost_equal(a, b, decimal=decimal)
    except AssertionError:
        return False
    return True

image = [abs(np.random.normal(size=(32, 32, 3))) for _ in range(20)]
label = [[0, 1] if i % 3 == 0 else [1, 0] for i in range(20)]
img = tf.convert_to_tensor(image, dtype=tf.float64)
support_class = tf.convert_to_tensor(label, dtype=tf.float64)
loss_fn = tf.losses.CategoricalCrossentropy()

with tf.GradientTape(persistent=True) as tape:
    tape.watch(img)
    softmaxed = model(img)
    loss = loss_fn(support_class, softmaxed)
grads = tape.gradient(loss, img, unconnected_gradients=tf.UnconnectedGradients.NONE)

# summing up all gradients with reduction over all dimensions:
print(tf.reduce_sum(grads, axis=None).numpy())  # 0.07137820225818814

# comparing to zeros:
zeros_like_grads = np.zeros_like(grads.numpy())
for decimal in range(10, 0, -1):
    print('decimal: {0}: {1}'.format(
        decimal, almost_equals(zeros_like_grads, grads.numpy(), decimal=decimal)))
# decimal: 10: False
# decimal: 9: False
# decimal: 8: False
# decimal: 7: False
# decimal: 6: False
# decimal: 5: False
# decimal: 4: False
# decimal: 3: True
# decimal: 2: True
# decimal: 1: True
As you can see, the comparison against zeros only starts returning True at decimal=3, so the gradients match zero to three decimal places but not beyond.
Answer 1: (score: 1)
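So, it turns out there was nothing wrong with the network. The problem was related to the behavior of the softmax activation I used in the last Dense layer. I had not considered that very confident predictions from softmax (for example, one of my predictions was [[1.0000000e+00 1.9507678e-25]]) would make the gradients zero (very close to zero in theory, but zero in practice). A useful thread discussing this problem and how to deal with it: https://github.com/keras-team/keras/issues/5881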
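A minimal pure-Python sketch of the saturation effect described above. The logits are made up for illustration (they are not taken from the network in the question), and the helper names `softmax` and `softmax_jacobian` are hypothetical. It relies only on the standard softmax derivative, dp_i/dz_j = p_i * (delta_ij - p_j), and compares a moderate prediction with a very confident one:

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(p):
    """Jacobian of softmax outputs w.r.t. logits: p_i * (delta_ij - p_j)."""
    n = len(p)
    return [[p[i] * ((1.0 if i == j else 0.0) - p[j]) for j in range(n)]
            for i in range(n)]

def largest_entry(jac):
    """Largest absolute entry of the Jacobian."""
    return max(abs(v) for row in jac for v in row)

# Illustrative logits (assumptions): a mild preference vs. a very confident one.
moderate_logits = [1.0, 0.0]
saturated_logits = [57.0, 0.0]  # second probability is on the order of 1e-25

moderate_max = largest_entry(softmax_jacobian(softmax(moderate_logits)))
saturated_max = largest_entry(softmax_jacobian(softmax(saturated_logits)))

print("moderate :", softmax(moderate_logits), moderate_max)
print("saturated:", softmax(saturated_logits), saturated_max)
```

Every gradient that flows back to the image is multiplied by these Jacobian entries, so once the prediction saturates, whatever reaches the input is vanishingly small regardless of how the earlier layers behave.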
My solution: turn off the softmax activation when I want to compute gradients with respect to the input image.
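One way this could look, sketched under assumptions: the tiny `logit_model` below is a hypothetical stand-in for the trained network (not the VGG16 from the question), its last Dense layer has no activation so it returns raw logits, and the loss is told so through `from_logits=True`. The input and label are random placeholders.

```python
import tensorflow as tf

# Hypothetical stand-in for the trained network. The final Dense layer has
# no activation, so the model outputs logits rather than probabilities.
logit_model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(8, activation="tanh"),
    tf.keras.layers.Dense(2),  # logits, softmax "turned off"
])

# Placeholder image batch and one-hot label (assumptions for illustration).
img = tf.random.uniform((1, 32, 32, 3))
label = tf.constant([[0.0, 1.0]])

# from_logits=True makes the loss apply softmax internally.
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

with tf.GradientTape() as tape:
    tape.watch(img)  # img is a constant tensor, so it must be watched
    logits = logit_model(img)
    loss = loss_fn(label, logits)

grads = tape.gradient(loss, img)

# A common way to collapse channels into a 2-D saliency map.
saliency = tf.reduce_max(tf.abs(grads), axis=-1)
print(saliency.shape)
```

For the model in the question, the equivalent change would be building the `predictions` layer without `activation='softmax'` for the saliency pass. Differentiating a single class logit instead of the loss is another option that avoids the softmax entirely.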