Keras ConvLSTM2D filters and input channels: ValueError: Number of input channels does not match corresponding dimension of filter, 7 != 1

Posted: 2019-05-28 20:52:57

Tags: keras conv-neural-network

I am trying to build a convolutional neural network in Keras with a convolutional LSTM layer (ConvLSTM2D) in the middle, to process sequences of greyscale images taken from videos. Each frame has shape (61, 61, 1), and whole sequences of frames are passed in together, so the overall input has shape (num_movies, num_frames, frame_height, frame_width, 1). The goal of the network is to predict future frames of a video given the current ones (i.e. to pass a sequence through the network and shift it forward by n frames).

The convolutional layers are pre-trained to autoencode the frames one at a time: I trained the convolutional part (the autoencoder; see the code below) separately on this frame-by-frame autoencoding task, so all that is left to train is the recurrent ConvLSTM2D layer, which I am now adding for the sequence prediction. The network works fine when the ConvLSTM2D layer has only one filter.
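For concreteness, the input tensor looks like this (a minimal sketch; `num_movies` is a placeholder for my actual number of videos, and 34 is the sequence length that appears in the summary below):

    import numpy as np

    # placeholder batch of greyscale frame sequences:
    # (num_movies, num_frames, frame_height, frame_width, channels)
    num_movies, num_frames = 10, 34
    movies = np.zeros((num_movies, num_frames, 61, 61, 1), dtype=np.float32)

Here is the summary of the combined network (every layer other than the ConvLSTM2D layer is wrapped in TimeDistributed):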

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
time_distributed_1 (ZeroPad) (None, 34, 64, 64, 1)     0         
_________________________________________________________________
time_distributed_2 (Conv2D) (None, 34, 64, 64, 16)    160       
_________________________________________________________________
time_distributed_3 (MaxPool) (None, 34, 32, 32, 16)    0         
_________________________________________________________________
time_distributed_4 (Conv2D) (None, 34, 32, 32, 8)     1160      
_________________________________________________________________
time_distributed_5 (MaxPool) (None, 34, 16, 16, 8)     0         
_________________________________________________________________
time_distributed_6 (Conv2D) (None, 34, 16, 16, 8)     584       
_________________________________________________________________
time_distributed_7 (MaxPool) (None, 34, 8, 8, 8)       0         
_________________________________________________________________
time_distributed_8 (Conv2D)  (None, 34, 8, 8, 8)       584       
_________________________________________________________________
time_distributed_9 (MaxPool)  (None, 34, 4, 4, 8)       0         
_________________________________________________________________
time_distributed_10 (Conv2D) (None, 34, 4, 4, 1)       73        
_________________________________________________________________
rnn (ConvLSTM2D)             (None, 34, 4, 4, 1)       36        
_________________________________________________________________
time_distributed_11 (Conv2D) (None, 34, 4, 4, 4)       40        
_________________________________________________________________
time_distributed_12 (UpSample) (None, 34, 8, 8, 4)       0         
_________________________________________________________________
time_distributed_13 (Conv2D) (None, 34, 8, 8, 8)       296       
_________________________________________________________________
time_distributed_14 (UpSample) (None, 34, 16, 16, 8)     0         
_________________________________________________________________
time_distributed_15 (Conv2D) (None, 34, 16, 16, 8)     584       
_________________________________________________________________
time_distributed_16 (UpSample) (None, 34, 32, 32, 8)     0         
_________________________________________________________________
time_distributed_17 (Conv2D) (None, 34, 32, 32, 16)    1168      
_________________________________________________________________
time_distributed_18 (UpSample) (None, 34, 64, 64, 16)    0         
_________________________________________________________________
time_distributed_19 (Conv2D) (None, 34, 64, 64, 1)     145       
_________________________________________________________________
time_distributed_20 (Cropping) (None, 34, 61, 61, 1)     0         
=================================================================
Total params: 4,830
Trainable params: 36
Non-trainable params: 4,794
_________________________________________________________________

Everything runs when the number of filters in the ConvLSTM2D layer is 1. However, as soon as I try to change the number of filters in the ConvLSTM2D layer, I get the error:

"ValueError: Number of input channels does not match corresponding dimension of filter, 7 != 1"

where 7 is the number of filters I am trying to use.

I build the autoencoder as follows:

    from keras.models import Sequential
    from keras.layers import (ZeroPadding2D, Conv2D, MaxPooling2D,
                              UpSampling2D, Cropping2D)

    image_shape = (61, 61, 1)  # a single greyscale frame

    autoencoder = Sequential()

    autoencoder.add(ZeroPadding2D(((2,1),(2,1)), input_shape=image_shape))    # (61, 61, 1)  --> (64, 64, 1)

    # encode each frame down to a (4, 4, 1) representation
    autoencoder.add(Conv2D(16, (3,3), activation='relu', padding='same'))     # (64, 64, 1)  --> (64, 64, 16)
    autoencoder.add(MaxPooling2D((2,2), padding='same'))                      # (64, 64, 16) --> (32, 32, 16)
    autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))      # (32, 32, 16) --> (32, 32, 8)
    autoencoder.add(MaxPooling2D((2,2), padding='same'))                      # (32, 32, 8)  --> (16, 16, 8)
    autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))      # (16, 16, 8)  --> (16, 16, 8)
    autoencoder.add(MaxPooling2D((2,2), padding='same'))                      # (16, 16, 8)  --> (8, 8, 8)
    autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))      # (8, 8, 8)    --> (8, 8, 8)
    autoencoder.add(MaxPooling2D((2,2), padding='same'))                      # (8, 8, 8)    --> (4, 4, 8)
    autoencoder.add(Conv2D(1, (3,3), activation='relu', padding='same'))      # (4, 4, 8)    --> (4, 4, 1)

    # decode the lower-dimensional representation back into an image
    autoencoder.add(Conv2D(4, (3,3), activation='relu', padding='same'))      # (4, 4, 1)    --> (4, 4, 4)
    autoencoder.add(UpSampling2D((2,2)))                                      # (4, 4, 4)    --> (8, 8, 4)
    autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))      # (8, 8, 4)    --> (8, 8, 8)
    autoencoder.add(UpSampling2D((2,2)))                                      # (8, 8, 8)    --> (16, 16, 8)
    autoencoder.add(Conv2D(8, (3,3), activation='relu', padding='same'))      # (16, 16, 8)  --> (16, 16, 8)
    autoencoder.add(UpSampling2D((2,2)))                                      # (16, 16, 8)  --> (32, 32, 8)
    autoencoder.add(Conv2D(16, (3,3), activation='relu', padding='same'))     # (32, 32, 8)  --> (32, 32, 16)
    autoencoder.add(UpSampling2D((2,2)))                                      # (32, 32, 16) --> (64, 64, 16)
    # sigmoid as the final activation keeps the reconstructed pixel values in [0, 1]
    autoencoder.add(Conv2D(1, (3,3), activation='sigmoid', padding='same'))   # (64, 64, 16) --> (64, 64, 1)

    autoencoder.add(Cropping2D(((2,1),(2,1))))                                # (64, 64, 1)  --> (61, 61, 1)
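I then pre-train this autoencoder on individual frames. A minimal sketch of that step (`frames` is a placeholder array of single frames with shape (num_samples, 61, 61, 1), and the optimizer, loss, and epoch count are examples rather than my exact settings):

    # pre-train the autoencoder to reconstruct single greyscale frames
    autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
    autoencoder.fit(frames, frames, epochs=50, batch_size=32)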

Once the autoencoder has been trained on the autoencoding task, I wrap its layers in TimeDistributed and add the LSTM layer:


    from keras.layers import TimeDistributed, ConvLSTM2D

    num_layers = len(autoencoder.layers)

    model = Sequential()

    # encoder half of the pretrained autoencoder, applied frame by frame
    for i in range(num_layers // 2):
        model.add(TimeDistributed(autoencoder.layers[i]))

    out_shape = autoencoder.layers[num_layers // 2 - 1].output_shape

    # convolutional LSTM in the low-dimensional space
    num_filters = 7
    kernel_shape = (2, 2)
    model.add(ConvLSTM2D(filters=num_filters,
                         kernel_size=kernel_shape,
                         activation='tanh',
                         padding='same',
                         return_sequences=True,
                         name='rnn'))

    ''' # SimpleRNN alternative
    model.add(TimeDistributed(Flatten()))
    model.add(SimpleRNN(rnn_size,
                        return_sequences=True,
                        activation='tanh',
                        name='rnn'))
    # NOTE: since the RNN changes the size of the output of the final Conv2D layer in the
    #       encoding section, we somehow have to map the dimension back down. This is what
    #       the Dense layer below does
    model.add(TimeDistributed(Dense(out_shape[1] * out_shape[2], activation='relu', name='ff')))
    model.add(TimeDistributed(Reshape((out_shape[1], out_shape[2], 1))))
    '''

    # decoder half of the pretrained autoencoder
    for i in range(num_layers // 2, num_layers):
        model.add(TimeDistributed(autoencoder.layers[i]))

    # set non-recurrent layers to untrainable; we already trained them as an autoencoder,
    # so the RNN just has to learn how to move the object in the low-dimensional space
    for layer in model.layers:
        if not (layer.name == 'rnn' or layer.name == 'ff'):
            layer.trainable = False

As soon as I change the number of filters to anything other than one, I get this error:

ValueError: Number of input channels does not match corresponding dimension of filter, 7 != 1

I don't understand why the number of filters should be tied to the number of input channels here. Can't we apply several filters, each with its own kernel, to the same input?
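That intuition does seem to hold for a ConvLSTM2D layer built on its own; the following minimal standalone sketch (separate from my model above) builds fine with 7 filters on a single-channel input:

    from keras.models import Sequential
    from keras.layers import ConvLSTM2D

    # standalone check, not part of the model above:
    # 7 filters applied to a 1-channel input builds without complaint
    test = Sequential()
    test.add(ConvLSTM2D(filters=7, kernel_size=(2, 2), padding='same',
                        return_sequences=True,
                        input_shape=(None, 4, 4, 1)))  # (time, rows, cols, channels)
    print(test.output_shape)  # (None, None, 4, 4, 7): the filter count sets the output channels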

I have tried some of the common fixes, such as setting data_format='channels_last'; a sketch of that attempt is below.
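Roughly, that attempt looked like the following (a sketch of the variant, with data_format set on the ConvLSTM2D layer and the other arguments unchanged from above):

    # sketch of one attempted fix: the same ConvLSTM2D layer as above,
    # with the data format set explicitly
    model.add(ConvLSTM2D(filters=num_filters,
                         kernel_size=kernel_shape,
                         activation='tanh',
                         padding='same',
                         return_sequences=True,
                         data_format='channels_last',
                         name='rnn'))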

0 Answers

There are no answers to this question yet.