Question

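This question has also been filed as an issue: https://github.com/fchollet/keras/issues/4266

I am trying to implement a Convolutional LSTM. It is a recurrent layer that accepts images as input and uses convolutions to compute the various gates of the LSTM. So I am trying to subclass Recurrent and change the input dimension.

To do that, I read the documentation on writing a custom layer and, as it recommends, read the source code to understand what happens under the hood.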

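I read the code of recurrent.py and I think the structure is clear: you inherit from Recurrent but do not override call. Instead you provide a custom step function, and Recurrent takes care of applying the step to each entry in the sequence.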
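The step contract can be illustrated with a minimal sketch in plain Python. This is not the Keras implementation: run_steps and running_sum_step are hypothetical names used only for illustration, and the loop merely mimics the idea of calling a step function once per sequence entry while threading a list of states through.

```python
def run_steps(step, inputs, initial_states):
    """Apply step to each entry of inputs, threading the states through.

    step(x_t, states) must return (output_t, new_states), where states
    and new_states are lists.
    """
    states = list(initial_states)
    outputs = []
    for x_t in inputs:
        output_t, states = step(x_t, states)
        outputs.append(output_t)
    last_output = outputs[-1] if outputs else None
    return last_output, outputs, states


def running_sum_step(x_t, states):
    # Toy step: the single state is the running total so far.
    total = states[0] + x_t
    return total, [total]


last, outputs, final_states = run_steps(running_sum_step, [1, 2, 3], [0])
print(last, outputs, final_states)  # 6 [1, 3, 6] [6]
```

Note that the states are passed around as a list even when there is only one state, which is the same convention the step function returns.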

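As a starting point I took the code of GRU and tried to adapt it to my needs. I want to combine a 2D convolution with a GRU (normally it would be an LSTM, but that does not matter here; I decided to implement a C-GRU).

The idea is to have a regular 2D convolution in the model that outputs 3 features. These 3 features are used as the r, z and h activations in the GRU. In the custom layer I only have to keep track of the state. My layer does not even have trainable weights; they are contained in the convolution.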
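To make the idea concrete, here is a minimal NumPy sketch of one such gated update, outside of Keras. The function names are hypothetical, and hard_sigmoid is assumed to be the usual piecewise-linear form clip(0.2 * x + 0.5, 0, 1). The reset gate r (channel 1) is left out because the step function in the question computes it but does not use it in the update.

```python
import numpy as np


def hard_sigmoid(x):
    # Piecewise-linear sigmoid approximation (assumed form).
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)


def cgru_step(x_t, h_prev):
    """One gated update for a single time step.

    x_t:    (samples, 3, rows, cols), the three convolution feature maps
    h_prev: (samples, rows, cols), the previous state
    """
    z = hard_sigmoid(x_t[:, 0])  # update gate from channel 0
    hh = np.tanh(x_t[:, 2])      # candidate state from channel 2
    # Element-wise blend of the old state and the candidate.
    return z * h_prev + (1.0 - z) * hh


x_t = np.random.randn(2, 3, 40, 40)
h_prev = np.zeros((2, 40, 40))
h = cgru_step(x_t, h_prev)
print(h.shape)  # (2, 40, 40)
```

Because z lies in [0, 1] and the candidate is a tanh output, the new state stays within [-1, 1] as long as the previous state does.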

The notable changes to the original GRU code are:

    def step(self, x, states):
        # the previous state is a 2D vector
        h_tm1 = states[0]  # previous memory

        z = self.inner_activation(x[:, 0, :, :])
        r = self.inner_activation(x[:, 1, :, :])
        hh = self.activation(x[:, 2, :, :])

        h = z * h_tm1 + (1 - z) * hh
        return h, [h]

As you can see, I am simply reusing the features from the convolution. The multiplication should be performed element-wise. I will debug this to make sure it behaves as intended.

Since the state becomes 2D, I am changing initial_state as well:

    def get_initial_states(self, x):
        initial_state = K.zeros_like(x)  # (samples, timesteps, input_dim)
        # input_dim = (3, x_dim, y_dim)
        initial_state = K.sum(initial_state, axis=(1, 2))  # (samples, x_dim, y_dim)
        return initial_state

The output_shape seems to be hardcoded for Recurrent networks. I am overriding it:

    def get_output_shape_for(self, input_shape):
        # TODO: this is hardcoding for th layout
        return (input_shape[0], 1, input_shape[2], input_shape[3])

Another thing that is hardcoded is input_spec. In the constructor, after calling super, I override it with my input dimension:

    class CGRU(Recurrent):
        def __init__(self,
                     init='glorot_uniform', inner_init='orthogonal',
                     activation='tanh', inner_activation='hard_sigmoid',
                     **kwargs):
            self.init = initializations.get(init)
            self.inner_init = initializations.get(inner_init)
            self.activation = activations.get(activation)
            self.inner_activation = activations.get(inner_activation)

            # removing the regularizers and the dropout

            super(CGRU, self).__init__(**kwargs)

            # this seems necessary in order to accept 5 input dimensions
            # (samples, timesteps, features, x, y)
            self.input_spec = [InputSpec(ndim=5)]

There are a few other minor changes. You can find the complete code here: http://pastebin.com/60ztPis3

Running it produces the following error message:

    theano.tensor.var.AsTensorError: ('Cannot convert [None] to TensorType',)

The entire error message is on pastebin: http://pastebin.com/Cdmr20Yn

I am trying to debug the code, but that is difficult because it goes deep into the Keras source. One thing I noticed: execution never reaches my custom step function, so apparently something in the configuration is wrong. In the call function of Recurrent, input_shape is a tuple containing the entries

    (None, 40, 1, 40, 40)

This is correct. My sequence has 40 elements. Each one is an image with 1 feature and a resolution of 40x40. I am using the "th" layout.

Here is the call function of Recurrent. My code reaches the call to K.rnn, and the setup looks fine to me. input_spec seems to be correct. But it crashes during K.rnn, without ever reaching my step function.

    def call(self, x, mask=None):
        # input shape: (nb_samples, time (padded with zeros), input_dim)
        # note that the .build() method of subclasses MUST define
        # self.input_spec with a complete input shape.
        input_shape = self.input_spec[0].shape
        if self.stateful:
            initial_states = self.states
        else:
            initial_states = self.get_initial_states(x)
        constants = self.get_constants(x)
        preprocessed_input = self.preprocess_input(x)

        last_output, outputs, states = K.rnn(self.step, preprocessed_input,
                                             initial_states,
                                             go_backwards=self.go_backwards,
                                             mask=mask,
                                             constants=constants,
                                             unroll=self.unroll,
                                             input_length=input_shape[1])

At this point I am lost. It seems to me that I am missing part of the configuration.

Update

Well, now I have a strange problem. My code is now:

    # this is the actual input, fed to the network
    inputs = Input((1, 40, 40, 40))

    # now reshape to a sequence
    reshaped = Reshape((40, 1, 40, 40))(inputs)

    conv_inputs = Input((1, 40, 40))
    conv1 = Convolution2D(3, 3, 3, activation='relu', border_mode='same')(conv_inputs)
    convmodel = Model(input=conv_inputs, output=conv1)
    convmodel.summary()

    # apply the segmentation to each layer
    time_dist = TimeDistributed(convmodel)(reshaped)

    from cgru import CGRU
    up = CGRU(go_backwards=False, return_sequences=True, name="up")
    up = up(time_dist)

    output = Reshape([1, 40, 40, 40])(up)
    model = Model(input=inputs, output=output)
    print(model.summary())

On a machine with Theano as the backend, this works. The model summary is:

    ____________________________________________________________________________________________________
    Layer (type)                     Output Shape          Param #     Connected to
    ====================================================================================================
    input_1 (InputLayer)             (None, 1, 40, 40, 40) 0
    ____________________________________________________________________________________________________
    reshape_1 (Reshape)              (None, 40, 1, 40, 40) 0           input_1[0][0]
    ____________________________________________________________________________________________________
    timedistributed_1 (TimeDistribute(None, 40, 3, 40, 40) 30          reshape_1[0][0]
    ____________________________________________________________________________________________________
    up (CGRU)                        (None, 40, 1, 40, 40) 0           timedistributed_1[0][0]
    ____________________________________________________________________________________________________
    reshape_2 (Reshape)              (None, 1, 40, 40, 40) 0           up[0][0]
    ====================================================================================================
    Total params: 30
    ____________________________________________________________________________________________________

But on a machine with TensorFlow as the backend, the code fails. I added a model.summary() call. Up to that point it works, and then the program crashes:

    ValueError: Shapes (?, ?, 40, 40) and (40, ?, 40) are not compatible

It looks like Theano and TensorFlow use different (and incompatible) placeholders for batch_size? Note that in both cases I configured Keras to use the "th" image layout.

Answer 1

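I think the problem is solved. initial_states needs to be a list, and the output dimension had to be fixed. It seems to work now. There are some other issues with the underlying backend (e.g. Theano vs. TensorFlow), but those seem to be off-topic for this question.

Once I am sure that the problem is solved and the layer is able to learn, I will update this answer with all the necessary steps.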
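The answer does not show the corrected code, so the following is only a guess at what the two fixes might look like, written as standalone NumPy functions rather than Keras layer methods. The output shape for return_sequences=True is taken from the model summary in the question; the shape for return_sequences=False is an assumption.

```python
import numpy as np


def get_initial_states(x):
    """x: (samples, timesteps, 3, rows, cols) -> [state of shape (samples, rows, cols)]"""
    # Sum a zero tensor over the time and feature axes to get a zero state
    # with the right shape, as in the question.
    state = np.zeros_like(x).sum(axis=(1, 2))
    return [state]  # a list with one entry per state tensor


def get_output_shape_for(input_shape, return_sequences=True):
    # input_shape: (samples, timesteps, features, rows, cols), "th" layout
    samples, timesteps, _features, rows, cols = input_shape
    if return_sequences:
        return (samples, timesteps, 1, rows, cols)
    return (samples, 1, rows, cols)  # assumed shape for a single output


states = get_initial_states(np.ones((2, 40, 3, 40, 40)))
print(type(states).__name__, states[0].shape)       # list (2, 40, 40)
print(get_output_shape_for((None, 40, 3, 40, 40)))  # (None, 40, 1, 40, 40)
```

The original get_output_shape_for indexed a 5D input shape as if it were 4D, so it picked up the feature count instead of the image size, which is one plausible reading of "the output dimension had to be fixed".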

Keras: writing a Recurrent layer that accepts images
