How do I manipulate (constrain) the weights of the filter kernel in Keras Conv2D?

Time: 2018-05-24 15:52:45

Tags: python tensorflow filter keras constraints

As far as I know, Conv2D in Keras offers several kernel_constraint options: max_norm, non_neg, or unit_norm.

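What I need, however, is to set the anchor (center) position of the filter kernel to zero. For example, suppose we have a filter kernel of size (width, height) = (5, 5) and the input has 3 channels. I need to constrain the anchor (center) point of the kernel to 0 for every channel, i.e. w(2, 2, :) = 0, assuming the channels are the third dimension. If there are multiple filters, the anchor position of every filter should be zero. How can I implement this?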
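A minimal NumPy sketch of the requested constraint, assuming the Keras channels_last kernel layout (rows, cols, input channels, filters). The array names and the filter count of 4 are only for illustration. A mask of shape (5, 5, 1, 1) broadcasts over every channel and every filter:

```python
import numpy as np

# Hypothetical kernel: 5x5 window, 3 input channels, 4 filters.
# Adding 0.1 keeps every entry strictly positive, so zeros are easy to spot.
kernel = np.random.rand(5, 5, 3, 4) + 0.1

# Mask of ones with a single zero at the spatial center; the trailing
# (1, 1) axes broadcast across all input channels and all filters.
mask = np.ones((5, 5, 1, 1))
mask[2, 2] = 0.0

constrained = kernel * mask

print(constrained[2, 2])                    # the (3, 4) block at the center
print(np.count_nonzero(constrained == 0))   # number of zeroed entries
```

Only the 3 * 4 = 12 center entries become zero; every other weight is left unchanged.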

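I assume a custom kernel constraint is needed. This link suggests how to create a class that inherits from Constraint: https://github.com/keras-team/keras/issues/8196. This one shows how the built-in constraints are implemented: https://github.com/keras-team/keras/blob/master/keras/constraints.py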

However, I still don't know how to manipulate the dimensions of w or how to set the desired position to zero. Any help is appreciated. Thanks.
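For reference, a custom constraint along these lines might look like the sketch below. The class name ZeroCenter is hypothetical, and the sketch assumes a 4D Conv2D kernel of shape (rows, cols, input channels, filters) with float32 weights on the TensorFlow backend. It is not the approach taken in the answer further down.

```python
import numpy as np
from keras.constraints import Constraint


class ZeroCenter(Constraint):
    """Hypothetical constraint: zero the spatial center of a Conv2D kernel."""

    def __call__(self, w):
        # Conv2D kernels have shape (rows, cols, in_channels, filters).
        rows, cols = int(w.shape[0]), int(w.shape[1])
        assert rows % 2 == 1 and cols % 2 == 1, "kernel size must be odd"

        mask = np.ones((rows, cols, 1, 1), dtype="float32")
        mask[rows // 2, cols // 2] = 0.0
        # Broadcasting zeroes the center for every channel and every filter.
        return w * mask
```

It would be passed as Conv2D(8, (5, 5), kernel_constraint=ZeroCenter()). Keras applies constraints after each weight update, so with this approach the stored center weights themselves are reset to zero after every optimizer step.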

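Update: I tried Daniel Möller's answer. The error message is as follows:
raise ValueError('An operation has `None` for gradient. ' ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.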

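Since Daniel can run it without problems, I am posting my simplified code here to find out what is wrong in my program. My data has 8 channels, but the number of channels does not matter.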

from keras.layers import Input, Conv2D
from keras.models import Model
from keras import optimizers
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.callbacks import ModelCheckpoint


class ZeroCenterConv2D(Conv2D):
    def __init__(self, filters, kernel_size, **kwargs):
        super(ZeroCenterConv2D, self).__init__(filters, kernel_size, **kwargs)

    def call(self, inputs):
        assert self.kernel_size[0] % 2 == 1, "Error: the kernel size is an even number"
        assert self.kernel_size[1] % 2 == 1, "Error: the kernel size is an even number"

        centerX = (self.kernel_size[0] - 1) // 2
        centerY = (self.kernel_size[1] - 1) // 2

        kernel_mask = np.ones(self.kernel_size + (1, 1))
        kernel_mask[centerX, centerY] = 0
        kernel_mask = K.constant(kernel_mask)

        customKernel = self.kernel * kernel_mask

        outputs = K.conv2d(
            inputs,
            customKernel,
            strides=self.strides,
            padding=self.padding,
            data_format=self.data_format,
            dilation_rate=self.dilation_rate)

        if self.activation is not None:
            return self.activation(outputs)

        return outputs


size1 = 256
size2 = 256
input_img = Input(shape=(size1, size2, 8))
conv1 = ZeroCenterConv2D(8, (5, 5), padding='same', activation='relu')(input_img)
autoencoder = Model(input_img, conv1)
adam = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-8)
autoencoder.compile(optimizer=adam, loss='mean_squared_error')


import scipy.io
A = scipy.io.loadmat('data_train')
x_train = A['data']
x_train = np.reshape(x_train, (1, 256, 256, 8))


from keras.callbacks import TensorBoard

autoencoder.fit(x_train, x_train,
                epochs=5,
                batch_size=1,
                shuffle=False,
                validation_data=(x_train, x_train),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])


decoded_imgs = autoencoder.predict(x_train)

When conv1 = ZeroCenterConv2D(...) is replaced by the ordinary conv1 = Conv2D(...), everything works fine.

Full error message:

Connected to pydev debugger (build 181.4668.75)
/home/allen/kerasProject/keras/venv/py2.7/local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Traceback (most recent call last):
  File "/snap/pycharm-community/60/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/snap/pycharm-community/60/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/snap/pycharm-community/60/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/allen/autotion/temptest", line 62, in <module>
    callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])
  File "/home/allen/kerasProject/keras/venv/py2.7/local/lib/python2.7/site-packages/keras/engine/training.py", line 1682, in fit
    self._make_train_function()
  File "/home/allen/kerasProject/keras/venv/py2.7/local/lib/python2.7/site-packages/keras/engine/training.py", line 992, in _make_train_function
    loss=self.total_loss)
  File "/home/allen/kerasProject/keras/venv/py2.7/local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/allen/kerasProject/keras/venv/py2.7/local/lib/python2.7/site-packages/keras/optimizers.py", line 445, in get_updates
    grads = self.get_gradients(loss, params)
  File "/home/allen/kerasProject/keras/venv/py2.7/local/lib/python2.7/site-packages/keras/optimizers.py", line 80, in get_gradients
    raise ValueError('An operation has `None` for gradient. '
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Process finished with exit code 1

Further update: after adding the "bias" part to the code in Daniel's answer (already done there), the problem is solved!
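One plausible reading of why the bias part matters, not confirmed in the thread: Conv2D creates a bias weight by default (use_bias=True), and a call method that never uses it leaves that trainable weight with no path to the loss, so its gradient is None. The sketch below illustrates the effect with the TensorFlow 2 eager API, which is an assumption here (the question uses an older Keras on Python 2.7). The variables are hypothetical stand-ins for the layer's weights.

```python
import tensorflow as tf

# Hypothetical stand-ins for the layer's kernel and bias.
kernel = tf.Variable(tf.ones((5, 5, 8, 8)))
bias = tf.Variable(tf.zeros((8,)))
x = tf.ones((1, 16, 16, 8))

with tf.GradientTape() as tape:
    # The bias is deliberately left out, as in the question's call().
    y = tf.nn.conv2d(x, kernel, strides=1, padding="SAME")
    loss = tf.reduce_mean(tf.square(y - x))

grad_kernel, grad_bias = tape.gradient(loss, [kernel, bias])
print(grad_kernel is None)  # the kernel is connected to the loss
print(grad_bias is None)    # the unused bias has no gradient
```

Under the same reading, passing use_bias=False to the layer would be another way to avoid creating the unused weight.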

1 Answer:

Answer 0 (score: 0)

You need a custom Conv2D layer whose call method is changed to apply zeros at the center.

class ZeroCenterConv2D(Conv2D):
    def __init__(self, filters, kernel_size, **kwargs):
        super(ZeroCenterConv2D, self).__init__(filters, kernel_size, **kwargs)

    def call(self, inputs):
        assert self.kernel_size[0] % 2 == 1, "Error: the kernel size is an even number"
        assert self.kernel_size[1] % 2 == 1, "Error: the kernel size is an even number"

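        # Index of the central element along each spatial axis.
        centerX = (self.kernel_size[0] - 1) // 2
        centerY = (self.kernel_size[1] - 1) // 2

        # Mask of ones with a zero at the center; the trailing (1, 1) axes
        # broadcast over all input channels and all filters.
        kernel_mask = np.ones(self.kernel_size + (1, 1))
        kernel_mask[centerX, centerY] = 0
        kernel_mask = K.variable(kernel_mask)

        # The stored kernel is left untouched; only the masked copy is used.
        customKernel = self.kernel * kernel_mask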

        outputs = K.conv2d(
            inputs,
            customKernel,
            strides=self.strides,
            padding=self.padding,
            data_format=self.data_format,
            dilation_rate=self.dilation_rate)

        if self.use_bias:
            outputs = K.bias_add(
                outputs,
                self.bias,
                data_format=self.data_format)

        if self.activation is not None:
            return self.activation(outputs)

        return outputs

This does not replace the actual weights, but the center weights will never be used.

When you call layer.get_weights() on this layer, you will see the center weights as they were initialized (not zeros).
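The point can be checked with a small NumPy sketch, assuming a single channel and a naive "valid" cross-correlation written only for illustration (correlate_valid is a hypothetical helper, not a Keras function). Changing the stored center weight has no effect on the output, because the masked kernel is what gets applied:

```python
import numpy as np


def correlate_valid(image, kernel):
    """Naive 'valid' 2D cross-correlation for a single channel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


rng = np.random.RandomState(0)
image = rng.rand(8, 8)
stored = rng.rand(5, 5) + 0.1   # stands in for what get_weights() returns
mask = np.ones((5, 5))
mask[2, 2] = 0.0

out_a = correlate_valid(image, stored * mask)

# Change only the stored center weight; the masked kernel stays the same.
stored_b = stored.copy()
stored_b[2, 2] = 123.0
out_b = correlate_valid(image, stored_b * mask)

print(stored[2, 2] != 0.0)         # the stored center is still nonzero
print(np.allclose(out_a, out_b))   # the outputs are identical
```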