I am trying to build a convolutional neural network in Keras with a convolutional LSTM layer (ConvLSTM2D) in the middle, to process sequences of greyscale images taken from videos. Each frame has shape (61, 61, 1), and whole sequences are passed in together, so the total input shape is (num_movies, num_frames, frame_height, frame_width, 1). The goal of the network is to predict future frames of a video given the current ones (i.e., pass a sequence through and shift it forward by n frames). I pretrained the convolutional part (an autoencoder; see the code below) to autoencode individual frames, so all that remains is to train the recurrent ConvLSTM2D layer for sequence prediction. The network works fine when the ConvLSTM2D layer has a single filter. Here is the network summary (every layer other than the LSTM layer is wrapped in TimeDistributed):
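For concreteness, this is roughly how I build the inputs and targets by shifting each sequence forward by n frames (a minimal sketch with dummy data; the 40 frames and n = 6 are placeholders, chosen so the shifted length matches the 34 time steps in the summary below):

import numpy as np

# dummy stand-in for my data: (num_movies, num_frames, height, width, channels)
movies = np.random.rand(10, 40, 61, 61, 1).astype('float32')
n = 6  # how many frames ahead to predict

X = movies[:, :-n]  # frames 0 .. T-n-1, shape (10, 34, 61, 61, 1)
Y = movies[:, n:]   # frames n .. T-1, i.e. X shifted forward by n frames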
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
time_distributed_1 (ZeroPad) (None, 34, 64, 64, 1) 0
_________________________________________________________________
time_distributed_2 (Conv2D) (None, 34, 64, 64, 16) 160
_________________________________________________________________
time_distributed_3 (MaxPool) (None, 34, 32, 32, 16) 0
_________________________________________________________________
time_distributed_4 (Conv2D) (None, 34, 32, 32, 8) 1160
_________________________________________________________________
time_distributed_5 (MaxPool) (None, 34, 16, 16, 8) 0
_________________________________________________________________
time_distributed_6 (Conv2D) (None, 34, 16, 16, 8) 584
_________________________________________________________________
time_distributed_7 (MaxPool) (None, 34, 8, 8, 8) 0
_________________________________________________________________
time_distributed_8 (Conv2D) (None, 34, 8, 8, 8) 584
_________________________________________________________________
time_distributed_9 (MaxPool) (None, 34, 4, 4, 8) 0
_________________________________________________________________
time_distributed_10 (Conv2D) (None, 34, 4, 4, 1) 73
_________________________________________________________________
rnn (ConvLSTM2D) (None, 34, 4, 4, 1) 36
_________________________________________________________________
time_distributed_11 (Conv2D) (None, 34, 4, 4, 4) 40
_________________________________________________________________
time_distributed_12 (UpSample) (None, 34, 8, 8, 4) 0
_________________________________________________________________
time_distributed_13 (Conv2D) (None, 34, 8, 8, 8) 296
_________________________________________________________________
time_distributed_14 (UpSample) (None, 34, 16, 16, 8) 0
_________________________________________________________________
time_distributed_15 (Conv2D) (None, 34, 16, 16, 8) 584
_________________________________________________________________
time_distributed_16 (UpSample) (None, 34, 32, 32, 8) 0
_________________________________________________________________
time_distributed_17 (Conv2D) (None, 34, 32, 32, 16) 1168
_________________________________________________________________
time_distributed_18 (UpSample) (None, 34, 64, 64, 16) 0
_________________________________________________________________
time_distributed_19 (Conv2D) (None, 34, 64, 64, 1) 145
_________________________________________________________________
time_distributed_20 (Cropping2D) (None, 34, 61, 61, 1) 0
=================================================================
Total params: 4,830
Trainable params: 36
Non-trainable params: 4,794
_________________________________________________________________
Everything runs when the number of filters in the ConvLSTM2D layer is 1. However, as soon as I change the number of filters in the ConvLSTM2D layer, I get the error:

ValueError: number of input channels does not match corresponding dimension of filter, 7 != 1

where 7 is the number of filters I want to use.
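For reference, here is a condensed standalone version of my setup that (I believe) hits the same error; the layer sizes are simplified placeholders. A Conv2D layer that was already built for a 1-channel input gets reused after a ConvLSTM2D that outputs 7 channels:

from keras.models import Sequential
from keras.layers import Conv2D, ConvLSTM2D, TimeDistributed

# "pretrained" decoder layer, built for 1 input channel, so its kernel
# is created with shape (3, 3, 1, 4)
decoder_conv = Conv2D(4, (3, 3), padding='same')
pretrained = Sequential([decoder_conv])
pretrained.build((None, 4, 4, 1))

model = Sequential()
model.add(ConvLSTM2D(filters=7, kernel_size=(2, 2), padding='same',
                     return_sequences=True,
                     input_shape=(None, 4, 4, 1)))  # outputs (..., 4, 4, 7)
model.add(TimeDistributed(decoder_conv))  # reuses the kernel built for 1 channel
# ValueError: number of input channels does not match corresponding
# dimension of filter, 7 != 1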
Here is how I build the autoencoder:
from keras.models import Sequential
from keras.layers import ZeroPadding2D, Conv2D, MaxPooling2D, UpSampling2D, Cropping2D

image_shape = (61, 61, 1)  # a single greyscale frame

autoencoder = Sequential()
# encoder
autoencoder.add(ZeroPadding2D(((2,1),(2,1)), input_shape=image_shape))  # (61,61,1) --> (64,64,1)
autoencoder.add(Conv2D(16, (3,3), activation='relu', padding='same'))   # (64,64,1) --> (64,64,16)
autoencoder.add(MaxPooling2D((2,2), padding='same'))                    # (64,64,16) --> (32,32,16)
autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))    # (32,32,16) --> (32,32,8)
autoencoder.add(MaxPooling2D((2,2), padding='same'))                    # (32,32,8) --> (16,16,8)
autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))    # (16,16,8) --> (16,16,8)
autoencoder.add(MaxPooling2D((2,2), padding='same'))                    # (16,16,8) --> (8,8,8)
autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))    # (8,8,8) --> (8,8,8)
autoencoder.add(MaxPooling2D((2,2), padding='same'))                    # (8,8,8) --> (4,4,8)
autoencoder.add(Conv2D(1, (3,3), activation='relu', padding='same'))    # (4,4,8) --> (4,4,1)
# decoder: map the low-dimensional representation back to an image
autoencoder.add(Conv2D(4, (3,3), activation='relu', padding='same'))    # (4,4,1) --> (4,4,4)
autoencoder.add(UpSampling2D((2,2)))                                    # (4,4,4) --> (8,8,4)
autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))    # (8,8,4) --> (8,8,8)
autoencoder.add(UpSampling2D((2,2)))                                    # (8,8,8) --> (16,16,8)
autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))    # (16,16,8) --> (16,16,8)
autoencoder.add(UpSampling2D((2,2)))                                    # (16,16,8) --> (32,32,8)
autoencoder.add(Conv2D(16, (3,3), activation='relu', padding='same'))   # (32,32,8) --> (32,32,16)
autoencoder.add(UpSampling2D((2,2)))                                    # (32,32,16) --> (64,64,16)
# sigmoid as the final activation keeps output pixel values in [0, 1]
autoencoder.add(Conv2D(1, (3,3), activation='sigmoid', padding='same')) # (64,64,16) --> (64,64,1)
autoencoder.add(Cropping2D(((2,1),(2,1))))                              # (64,64,1) --> (61,61,1)
Once the autoencoder is built, I train it on the autoencoding task and then add the LSTM layer.
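The pretraining step itself looks roughly like this (a sketch; the optimizer, loss, and the frames array of individual frames are placeholders, since that code is not shown here):

import numpy as np

frames = np.random.rand(400, 61, 61, 1).astype('float32')  # placeholder: all individual frames
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(frames, frames, epochs=50, batch_size=32)

With the convolutional layers pretrained, I splice the ConvLSTM2D in between the encoder and decoder halves: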
from keras.layers import TimeDistributed, ConvLSTM2D, Flatten, SimpleRNN, Dense, Reshape

num_layers = len(autoencoder.layers)

model = Sequential()
# encoder half of the pretrained autoencoder, applied frame by frame
for i in range(num_layers // 2):
    model.add(TimeDistributed(autoencoder.layers[i]))
out_shape = autoencoder.layers[num_layers//2 - 1].output_shape

# Convolutional LSTM
num_filters = 7
kernel_shape = (2,2)
model.add(ConvLSTM2D(filters=num_filters,
                     kernel_size=kernel_shape,
                     activation='tanh',
                     padding='same',
                     return_sequences=True,
                     name='rnn'))

''' # SimpleRNN alternative
model.add(TimeDistributed(Flatten()))
model.add(SimpleRNN(rnn_size,
                    return_sequences=True,
                    activation='tanh',
                    name='rnn'))
# NOTE: since the RNN changes the size of the output of the final Conv2D layer
# in the encoding section, we somehow have to map the dimension back down.
# This is what the Dense layer below does.
model.add(TimeDistributed(Dense(out_shape[1] * out_shape[2], activation='relu', name='ff')))
model.add(TimeDistributed(Reshape((out_shape[1], out_shape[2], 1))))
'''

# decoder half of the pretrained autoencoder
for i in range(num_layers//2, num_layers):
    model.add(TimeDistributed(autoencoder.layers[i]))

# Set the non-recurrent layers to untrainable; they are already trained as an
# autoencoder, so the RNN just has to learn how to move the object in the
# low-dimensional space.
for layer in model.layers:
    if not (layer.name == 'rnn' or layer.name == 'ff'):
        layer.trainable = False
As soon as I change the number of filters to anything other than one, I immediately get this error:

ValueError: number of input channels does not match corresponding dimension of filter, 7 != 1

I don't understand why the number of filters has to be tied to the number of input channels. Can't we have multiple filters over the same input, each with a different kernel?
I have tried some common fixes, such as setting data_format='channels_last', without success.
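What adds to my confusion: a standalone ConvLSTM2D with 7 filters on a 1-channel input builds without complaint (a minimal sketch below, with placeholder spatial dimensions), so having more filters than input channels is clearly allowed in isolation:

from keras.models import Sequential
from keras.layers import ConvLSTM2D

m = Sequential()
m.add(ConvLSTM2D(filters=7, kernel_size=(2, 2), padding='same',
                 return_sequences=True,
                 input_shape=(None, 4, 4, 1)))  # 1 input channel, 7 filters
m.summary()  # builds fine; output shape (None, None, 4, 4, 7)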