Gradients of a custom convolution layer are None

Time: 2019-11-21 21:52:47

Tags: tensorflow conv-neural-network gradient

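I have implemented a basic MNIST model with a custom convolution layer, as shown below. The problem is that the gradients for the custom layers are always None, so no learning happens for them during backpropagation. I have debugged the outputs of the layers during the forward pass and they are fine. Here is the sample code; for simplicity I pass an image of all ones and simply return the input matrix from the custom layer. I have tried my best but could not get it to work. Thanks a lot in advance. The following code is executable and raises this warning:

WARNING:tensorflow:Gradients do not exist for variables ['cnn/custom_conv2d/kernel:0', 'cnn/custom_conv2d/bias:0', 'cnn/custom_conv2d_1/kernel:0', 'cnn/custom_conv2d_1/bias:0', 'cnn/custom_conv2d_2/kernel:0', 'cnn/custom_conv2d_2/bias:0'] when minimizing the loss.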

import numpy as np
import tensorflow as tf
from grpc.beta import interfaces
class CustomConv2D(tf.keras.layers.Conv2D):
    def __init__(self, filters,
                 kernel_size,
                 strides=(1, 1),
                 padding='valid',
                 data_format=None,
                 dilation_rate=(1, 1),
                 activation=None,
                 use_bias=True,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='glorot_uniform',
                 kernel_regularizer=None,
                 bias_regularizer=None,
                 activity_regularizer=None,
                 kernel_constraint=None,
                 bias_constraint=None,
                 __name__ = 'CustomConv2D',
                 **kwargs
                 ):
        super(CustomConv2D, self).__init__(
            filters=filters,
            kernel_size=kernel_size,
            strides=strides,
            padding=padding,
            data_format=data_format,
            dilation_rate=dilation_rate,
            activation=activation,
            use_bias=use_bias,
            kernel_initializer=kernel_initializer,
            bias_initializer=bias_initializer,
            kernel_regularizer=kernel_regularizer,
            bias_regularizer=bias_regularizer,
            activity_regularizer=activity_regularizer,
            kernel_constraint=kernel_constraint,
            bias_constraint=bias_constraint,
            **kwargs )

    def call(self,matrix):
        return self.customConv(matrix)

    def customConv(self,matrix):
        return matrix #dummy conv result
        #returns custom convolution result

class CNN(tf.keras.Model):
  def __init__(self):
    super(CNN, self).__init__()
    self.learning_rate = 0.001
    self.momentum = 0.9
    self.optimizer = tf.keras.optimizers.Adam(self.learning_rate, self.momentum)
    self.conv1 = CustomConv2D(filters = 6, kernel_size= 3, activation = 'relu')  ## valid means no padding
    self.pool1 = tf.keras.layers.MaxPool2D(pool_size=2) # default stride??-
    self.conv2 = CustomConv2D(filters = 16, kernel_size = 3,  activation = 'relu')
    self.pool2 = tf.keras.layers.MaxPool2D(pool_size = 2)
    self.conv3 = CustomConv2D(filters=120, kernel_size=3,  activation='relu')
    self.flatten = tf.keras.layers.Flatten()
    self.fc1 = tf.keras.layers.Dense(units=82,kernel_initializer='glorot_uniform')
    self.fc2 = tf.keras.layers.Dense(units=10, activation = 'softmax',kernel_initializer='glorot_uniform')
  def call(self, x):
      x = self.conv1(x)  # shap(32,26,26,6) all (6s 3s 6s 3s)
      x = self.pool1(x)  # shap(32,13,13,6) all (6s)
      x = self.conv2(x)  # shap(32,11,11,16) all(324s)
      x = self.pool2(x)  # shap(32,5,5,16)
      x = self.conv3(x)  # shap(32,3,3,120)all(46656)
      x = self.flatten(x)  # shap(32,1080)
      x = self.fc1(x)  # shap(32,82)
      x = self.fc2(x)  # shap(32,10)
      return x
  def feedForward(self, image, label):
            accuracy_object = tf.metrics.Accuracy()
            loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
            with tf.GradientTape() as tape:
                feedForwardCompuation = self(image, training=True)
                self.loss_value = loss_object(label, feedForwardCompuation)
            grads = tape.gradient(self.loss_value, self.variables)
            self.optimizer.apply_gradients(zip(grads, self.variables))
            accuracy = accuracy_object(tf.argmax(feedForwardCompuation, axis=1, output_type=tf.int32), label)

image=np.ones((28,28)).reshape((1,28,28,1)) #dummy image
label=1
cnn=CNN()
cnn.feedForward(image,label)

[Image: snapshot of grads while debugging]

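1 Answer:

Answer 0 (score: 0)

The problem is that no convolution takes place in the class CustomConv2D. Neither the call method nor the customConv method performs a convolution; they just return the input unchanged. Because the kernel and bias never take part in the forward pass, the loss does not depend on them and their gradients are None.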

  1. In the call method of class CustomConv2D, replace the line return self.customConv(matrix) with return super(tf.keras.layers.Conv2D, self).call(matrix), which performs the actual convolution.
  2. The other change is to invoke the call method of class CNN by adding the line _ = cnn(X_reshaped) before the line cnn.feedForward(image,label).

With these two changes, gradients are produced for the custom layers.
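The effect of the first change can be checked in isolation with a minimal sketch. The class names PassThroughConv2D and RealConv2D and the helper layer_gradients are hypothetical and not part of the question's code. The sketch uses the plain super().call(inputs) form instead of the answer's super(tf.keras.layers.Conv2D, self).call(matrix), and it assumes TensorFlow 2.x. The first printed list is expected to contain only True (no gradients) and the second only False.

```python
import numpy as np
import tensorflow as tf


class PassThroughConv2D(tf.keras.layers.Conv2D):
    """Hypothetical name. Mirrors the question's layer: call() ignores the kernel."""

    def call(self, inputs):
        # build() has created kernel and bias, but they are never used here.
        return inputs


class RealConv2D(tf.keras.layers.Conv2D):
    """Hypothetical name. Delegates to the parent's call(), so the kernel is used."""

    def call(self, inputs):
        return super().call(inputs)


def layer_gradients(layer, image):
    """Hypothetical helper: gradients of a scalar loss w.r.t. the layer's variables."""
    with tf.GradientTape() as tape:
        loss = tf.reduce_sum(layer(image))
    return tape.gradient(loss, layer.trainable_variables)


# Same dummy input as in the question: a single 28x28 image of ones.
image = np.ones((1, 28, 28, 1), dtype=np.float32)

broken = layer_gradients(PassThroughConv2D(filters=6, kernel_size=3), image)
fixed = layer_gradients(RealConv2D(filters=6, kernel_size=3), image)

print([g is None for g in broken])  # kernel, bias of the pass-through layer
print([g is None for g in fixed])   # kernel, bias of the delegating layer
```

A custom convolution written by hand would need the same property: its output has to be computed from self.kernel (and self.bias) using TensorFlow operations, otherwise the tape cannot connect the loss to those variables.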