Why does fixing the network input size reduce the model's file size?

Time: 2019-03-13 14:03:38

Tags: tensorflow keras

I have a model whose input size is left variable during training so that it generalizes better.

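For quantization I have to fix the input size, so I simply recreated the model with a fixed input size, copied over all the weights and biases, and saved it. For some reason, though, the saved file shrinks to roughly a third of its original size (4.6 MB down to 1.6 MB). Note that this is before any quantization or other processing, and the parameter count is unchanged.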
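The rebuild-and-copy step described here can be sketched roughly as follows. This is only an illustration: `build_model` is a hypothetical stand-in with two convolutions, not the actual network from the question, and the 368 x 256 size is borrowed from the second summary.

```python
import numpy as np
import tensorflow as tf


def build_model(height=None, width=None):
    # Hypothetical stand-in for the real network: conv -> batch norm -> conv.
    inputs = tf.keras.Input(shape=(height, width, 4))
    x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = tf.keras.layers.BatchNormalization()(x)
    outputs = tf.keras.layers.Conv2D(3, 3, padding="same")(x)
    return tf.keras.Model(inputs, outputs)


old_model = build_model()            # variable spatial size, as used for training
fixed_model = build_model(368, 256)  # fixed spatial size, as needed for quantization

# Layer order and weight shapes are identical, so the weights transfer one to one.
fixed_model.set_weights(old_model.get_weights())

same_params = old_model.count_params() == fixed_model.count_params()
same_weights = all(
    np.array_equal(a, b)
    for a, b in zip(old_model.get_weights(), fixed_model.get_weights())
)
print(same_params, same_weights)
```

Convolution and batch-normalization weights do not depend on the spatial input size, which is why the two models can share them unchanged.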

Here are the two model summaries:

Model 1 = 4.6 MB

old_model.summary(line_length=110)


______________________________________________________________________________________________________________
Layer (type)                        Output Shape            Param #      Connected to                         
==============================================================================================================
input_1 (InputLayer)                (None, None, None, 4)   0                                                 
______________________________________________________________________________________________________________
gaussian_noise_1 (GaussianNoise)    (None, None, None, 4)   0            input_1[0][0]                        
______________________________________________________________________________________________________________
conv2d_1 (Conv2D)                   (None, None, None, 32)  1184         gaussian_noise_1[0][0]               
______________________________________________________________________________________________________________
batch_normalization_1 (BatchNormali (None, None, None, 32)  128          conv2d_1[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_2 (GaussianNoise)    (None, None, None, 32)  0            batch_normalization_1[0][0]          
______________________________________________________________________________________________________________
conv2d_2 (Conv2D)                   (None, None, None, 32)  9248         gaussian_noise_2[0][0]               
______________________________________________________________________________________________________________
batch_normalization_2 (BatchNormali (None, None, None, 32)  128          conv2d_2[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_3 (GaussianNoise)    (None, None, None, 32)  0            batch_normalization_2[0][0]          
______________________________________________________________________________________________________________
conv2d_3 (Conv2D)                   (None, None, None, 64)  18496        gaussian_noise_3[0][0]               
______________________________________________________________________________________________________________
batch_normalization_3 (BatchNormali (None, None, None, 64)  256          conv2d_3[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_4 (GaussianNoise)    (None, None, None, 64)  0            batch_normalization_3[0][0]          
______________________________________________________________________________________________________________
conv2d_4 (Conv2D)                   (None, None, None, 64)  36928        gaussian_noise_4[0][0]               
______________________________________________________________________________________________________________
batch_normalization_4 (BatchNormali (None, None, None, 64)  256          conv2d_4[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_5 (GaussianNoise)    (None, None, None, 64)  0            batch_normalization_4[0][0]          
______________________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D)      (None, None, None, 64)  0            gaussian_noise_5[0][0]               
______________________________________________________________________________________________________________
concatenate_1 (Concatenate)         (None, None, None, 96)  0            up_sampling2d_1[0][0]                
                                                                         batch_normalization_1[0][0]          
______________________________________________________________________________________________________________
gaussian_noise_6 (GaussianNoise)    (None, None, None, 96)  0            concatenate_1[0][0]                  
______________________________________________________________________________________________________________
conv2d_5 (Conv2D)                   (None, None, None, 64)  55360        gaussian_noise_6[0][0]               
______________________________________________________________________________________________________________
batch_normalization_5 (BatchNormali (None, None, None, 64)  256          conv2d_5[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_7 (GaussianNoise)    (None, None, None, 64)  0            batch_normalization_5[0][0]          
______________________________________________________________________________________________________________
conv2d_6 (Conv2D)                   (None, None, None, 64)  36928        gaussian_noise_7[0][0]               
______________________________________________________________________________________________________________
batch_normalization_6 (BatchNormali (None, None, None, 64)  256          conv2d_6[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_8 (GaussianNoise)    (None, None, None, 64)  0            batch_normalization_6[0][0]          
______________________________________________________________________________________________________________
conv2d_7 (Conv2D)                   (None, None, None, 64)  36928        gaussian_noise_8[0][0]               
______________________________________________________________________________________________________________
batch_normalization_7 (BatchNormali (None, None, None, 64)  256          conv2d_7[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_9 (GaussianNoise)    (None, None, None, 64)  0            batch_normalization_7[0][0]          
______________________________________________________________________________________________________________
conv2d_8 (Conv2D)                   (None, None, None, 64)  36928        gaussian_noise_9[0][0]               
______________________________________________________________________________________________________________
batch_normalization_8 (BatchNormali (None, None, None, 64)  256          conv2d_8[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_10 (GaussianNoise)   (None, None, None, 64)  0            batch_normalization_8[0][0]          
______________________________________________________________________________________________________________
conv2d_9 (Conv2D)                   (None, None, None, 64)  36928        gaussian_noise_10[0][0]              
______________________________________________________________________________________________________________
batch_normalization_9 (BatchNormali (None, None, None, 64)  256          conv2d_9[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_11 (GaussianNoise)   (None, None, None, 64)  0            batch_normalization_9[0][0]          
______________________________________________________________________________________________________________
up_sampling2d_2 (UpSampling2D)      (None, None, None, 64)  0            gaussian_noise_11[0][0]              
______________________________________________________________________________________________________________
input_2 (InputLayer)                (None, None, None, 3)   0                                                 
______________________________________________________________________________________________________________
concatenate_2 (Concatenate)         (None, None, None, 67)  0            up_sampling2d_2[0][0]                
                                                                         input_2[0][0]                        
______________________________________________________________________________________________________________
gaussian_noise_12 (GaussianNoise)   (None, None, None, 67)  0            concatenate_2[0][0]                  
______________________________________________________________________________________________________________
conv2d_10 (Conv2D)                  (None, None, None, 67)  40468        gaussian_noise_12[0][0]              
______________________________________________________________________________________________________________
batch_normalization_10 (BatchNormal (None, None, None, 67)  268          conv2d_10[0][0]                      
______________________________________________________________________________________________________________
gaussian_noise_13 (GaussianNoise)   (None, None, None, 67)  0            batch_normalization_10[0][0]         
______________________________________________________________________________________________________________
conv2d_11 (Conv2D)                  (None, None, None, 67)  40468        gaussian_noise_13[0][0]              
______________________________________________________________________________________________________________
batch_normalization_11 (BatchNormal (None, None, None, 67)  268          conv2d_11[0][0]                      
______________________________________________________________________________________________________________
gaussian_noise_14 (GaussianNoise)   (None, None, None, 67)  0            batch_normalization_11[0][0]         
______________________________________________________________________________________________________________
conv2d_12 (Conv2D)                  (None, None, None, 32)  19328        gaussian_noise_14[0][0]              
______________________________________________________________________________________________________________
batch_normalization_12 (BatchNormal (None, None, None, 32)  128          conv2d_12[0][0]                      
______________________________________________________________________________________________________________
gaussian_noise_15 (GaussianNoise)   (None, None, None, 32)  0            batch_normalization_12[0][0]         
______________________________________________________________________________________________________________
conv2d_13 (Conv2D)                  (None, None, None, 3)   867          gaussian_noise_15[0][0]              
==============================================================================================================
Total params: 372,771
Trainable params: 371,415
Non-trainable params: 1,356

Model 2 = 1.6 MB

model.summary(line_length=110)
______________________________________________________________________________________________________________
Layer (type)                        Output Shape            Param #      Connected to                         
==============================================================================================================
input_1 (InputLayer)                (None, 368, 256, 4)     0                                                 
______________________________________________________________________________________________________________
gaussian_noise_1 (GaussianNoise)    (None, 368, 256, 4)     0            input_1[0][0]                        
______________________________________________________________________________________________________________
conv2d_1 (Conv2D)                   (None, 368, 256, 32)    1184         gaussian_noise_1[0][0]               
______________________________________________________________________________________________________________
batch_normalization_1 (BatchNormali (None, 368, 256, 32)    128          conv2d_1[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_2 (GaussianNoise)    (None, 368, 256, 32)    0            batch_normalization_1[0][0]          
______________________________________________________________________________________________________________
conv2d_2 (Conv2D)                   (None, 184, 128, 32)    9248         gaussian_noise_2[0][0]               
______________________________________________________________________________________________________________
batch_normalization_2 (BatchNormali (None, 184, 128, 32)    128          conv2d_2[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_3 (GaussianNoise)    (None, 184, 128, 32)    0            batch_normalization_2[0][0]          
______________________________________________________________________________________________________________
conv2d_3 (Conv2D)                   (None, 184, 128, 64)    18496        gaussian_noise_3[0][0]               
______________________________________________________________________________________________________________
batch_normalization_3 (BatchNormali (None, 184, 128, 64)    256          conv2d_3[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_4 (GaussianNoise)    (None, 184, 128, 64)    0            batch_normalization_3[0][0]          
______________________________________________________________________________________________________________
conv2d_4 (Conv2D)                   (None, 184, 128, 64)    36928        gaussian_noise_4[0][0]               
______________________________________________________________________________________________________________
batch_normalization_4 (BatchNormali (None, 184, 128, 64)    256          conv2d_4[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_5 (GaussianNoise)    (None, 184, 128, 64)    0            batch_normalization_4[0][0]          
______________________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D)      (None, 368, 256, 64)    0            gaussian_noise_5[0][0]               
______________________________________________________________________________________________________________
concatenate_1 (Concatenate)         (None, 368, 256, 96)    0            up_sampling2d_1[0][0]                
                                                                         batch_normalization_1[0][0]          
______________________________________________________________________________________________________________
gaussian_noise_6 (GaussianNoise)    (None, 368, 256, 96)    0            concatenate_1[0][0]                  
______________________________________________________________________________________________________________
conv2d_5 (Conv2D)                   (None, 368, 256, 64)    55360        gaussian_noise_6[0][0]               
______________________________________________________________________________________________________________
batch_normalization_5 (BatchNormali (None, 368, 256, 64)    256          conv2d_5[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_7 (GaussianNoise)    (None, 368, 256, 64)    0            batch_normalization_5[0][0]          
______________________________________________________________________________________________________________
conv2d_6 (Conv2D)                   (None, 368, 256, 64)    36928        gaussian_noise_7[0][0]               
______________________________________________________________________________________________________________
batch_normalization_6 (BatchNormali (None, 368, 256, 64)    256          conv2d_6[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_8 (GaussianNoise)    (None, 368, 256, 64)    0            batch_normalization_6[0][0]          
______________________________________________________________________________________________________________
conv2d_7 (Conv2D)                   (None, 368, 256, 64)    36928        gaussian_noise_8[0][0]               
______________________________________________________________________________________________________________
batch_normalization_7 (BatchNormali (None, 368, 256, 64)    256          conv2d_7[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_9 (GaussianNoise)    (None, 368, 256, 64)    0            batch_normalization_7[0][0]          
______________________________________________________________________________________________________________
conv2d_8 (Conv2D)                   (None, 368, 256, 64)    36928        gaussian_noise_9[0][0]               
______________________________________________________________________________________________________________
batch_normalization_8 (BatchNormali (None, 368, 256, 64)    256          conv2d_8[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_10 (GaussianNoise)   (None, 368, 256, 64)    0            batch_normalization_8[0][0]          
______________________________________________________________________________________________________________
conv2d_9 (Conv2D)                   (None, 368, 256, 64)    36928        gaussian_noise_10[0][0]              
______________________________________________________________________________________________________________
batch_normalization_9 (BatchNormali (None, 368, 256, 64)    256          conv2d_9[0][0]                       
______________________________________________________________________________________________________________
gaussian_noise_11 (GaussianNoise)   (None, 368, 256, 64)    0            batch_normalization_9[0][0]          
______________________________________________________________________________________________________________
up_sampling2d_2 (UpSampling2D)      (None, 736, 512, 64)    0            gaussian_noise_11[0][0]              
______________________________________________________________________________________________________________
input_2 (InputLayer)                (None, 736, 512, 3)     0                                                 
______________________________________________________________________________________________________________
concatenate_2 (Concatenate)         (None, 736, 512, 67)    0            up_sampling2d_2[0][0]                
                                                                         input_2[0][0]                        
______________________________________________________________________________________________________________
gaussian_noise_12 (GaussianNoise)   (None, 736, 512, 67)    0            concatenate_2[0][0]                  
______________________________________________________________________________________________________________
conv2d_10 (Conv2D)                  (None, 736, 512, 67)    40468        gaussian_noise_12[0][0]              
______________________________________________________________________________________________________________
batch_normalization_10 (BatchNormal (None, 736, 512, 67)    268          conv2d_10[0][0]                      
______________________________________________________________________________________________________________
gaussian_noise_13 (GaussianNoise)   (None, 736, 512, 67)    0            batch_normalization_10[0][0]         
______________________________________________________________________________________________________________
conv2d_11 (Conv2D)                  (None, 736, 512, 67)    40468        gaussian_noise_13[0][0]              
______________________________________________________________________________________________________________
batch_normalization_11 (BatchNormal (None, 736, 512, 67)    268          conv2d_11[0][0]                      
______________________________________________________________________________________________________________
gaussian_noise_14 (GaussianNoise)   (None, 736, 512, 67)    0            batch_normalization_11[0][0]         
______________________________________________________________________________________________________________
conv2d_12 (Conv2D)                  (None, 736, 512, 32)    19328        gaussian_noise_14[0][0]              
______________________________________________________________________________________________________________
batch_normalization_12 (BatchNormal (None, 736, 512, 32)    128          conv2d_12[0][0]                      
______________________________________________________________________________________________________________
gaussian_noise_15 (GaussianNoise)   (None, 736, 512, 32)    0            batch_normalization_12[0][0]         
______________________________________________________________________________________________________________
conv2d_13 (Conv2D)                  (None, 736, 512, 3)     867          gaussian_noise_15[0][0]              
==============================================================================================================
Total params: 372,771
Trainable params: 371,415
Non-trainable params: 1,356
___________________________

1 answer:

Answer 0 (score: 1)

@FCOS I think the difference arises because one of your models has been trained and the other has not.

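When you save a trained model, the file contains

  1. the model architecture,
  2. the weights and biases, and
  3. the optimizer configuration (including, once training has run, the optimizer's state).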

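When you save an untrained model, however, there is no optimizer configuration to store, so the file holds only the architecture and the weights.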
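A rough back-of-the-envelope check of this explanation, using the parameter counts from the question's summaries. The question does not say which optimizer was used, so the two-slots-per-parameter figure below is an assumption that holds for Adam-style optimizers only.

```python
# Parameter counts taken from the model summaries in the question.
TOTAL_PARAMS = 372_771
TRAINABLE_PARAMS = 371_415
BYTES_PER_FLOAT32 = 4

# Assumption: an Adam-style optimizer that keeps two extra float32 values
# (first and second moment estimates) for every trainable parameter.
SLOTS_PER_PARAM = 2

weights_mb = TOTAL_PARAMS * BYTES_PER_FLOAT32 / 1e6
optimizer_mb = TRAINABLE_PARAMS * SLOTS_PER_PARAM * BYTES_PER_FLOAT32 / 1e6

print(f"weights only:        {weights_mb:.2f} MB")
print(f"weights + optimizer: {weights_mb + optimizer_mb:.2f} MB")
```

Under that assumption the estimates come to about 1.49 MB and 4.46 MB, each a little below the reported 1.6 MB and 4.6 MB. The remainder would plausibly be the stored architecture and file-format overhead, so the numbers are consistent with the optimizer explanation without proving it.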

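To test the size difference, I created two simple models, one with a fixed input size and one without, and found that both files are exactly the same size because both models have the same number of parameters. See model1 and model2 below.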

Here is model 1:

import tensorflow as tf

# Model 1: variable leading dimensions (None, None) with 784 features
model1 = tf.keras.models.Sequential([
  tf.keras.layers.Dense(128, activation='relu', input_shape=(None, None, 784)),
  tf.keras.layers.Dense(256, activation='relu'),
  tf.keras.layers.Dense(512, activation='relu'),
  tf.keras.layers.Dense(256, activation='relu'),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(64, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])

# The model is never compiled, so there is no optimizer to include in the file
model1.save('mymodel1.h5', overwrite=True, include_optimizer=True)
model1.summary()

Model: "sequential_10"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_20 (Dense)             (None, None, None, 128)   100480    
_________________________________________________________________
dense_21 (Dense)             (None, None, None, 256)   33024     
_________________________________________________________________
dense_22 (Dense)             (None, None, None, 512)   131584    
_________________________________________________________________
dense_23 (Dense)             (None, None, None, 256)   131328    
_________________________________________________________________
dense_24 (Dense)             (None, None, None, 128)   32896     
_________________________________________________________________
dense_25 (Dense)             (None, None, None, 64)    8256      
_________________________________________________________________
dense_26 (Dense)             (None, None, None, 10)    650       
=================================================================
Total params: 438,218
Trainable params: 438,218
Non-trainable params: 0

Here is model 2:

# Model 2: same layers, but with fixed leading dimensions (300, 300)
model2 = tf.keras.models.Sequential([
  tf.keras.layers.Dense(128, activation='relu', input_shape=(300, 300, 784)),
  tf.keras.layers.Dense(256, activation='relu'),
  tf.keras.layers.Dense(512, activation='relu'),
  tf.keras.layers.Dense(256, activation='relu'),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(64, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])

# As with model 1, nothing is compiled, so only architecture and weights are saved
model2.save('mymodel2.h5', overwrite=True, include_optimizer=True)
model2.summary()

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_27 (Dense)             (None, 300, 300, 128)     100480    
_________________________________________________________________
dense_28 (Dense)             (None, 300, 300, 256)     33024     
_________________________________________________________________
dense_29 (Dense)             (None, 300, 300, 512)     131584    
_________________________________________________________________
dense_30 (Dense)             (None, 300, 300, 256)     131328    
_________________________________________________________________
dense_31 (Dense)             (None, 300, 300, 128)     32896     
_________________________________________________________________
dense_32 (Dense)             (None, 300, 300, 64)      8256      
_________________________________________________________________
dense_33 (Dense)             (None, 300, 300, 10)      650       
=================================================================
Total params: 438,218
Trainable params: 438,218
Non-trainable params: 0
_________________________________________________________________

Model 1 and model 2 have the same file size (1.7 MB).
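One way to check the optimizer explanation directly is to save the same trained model twice, once with and once without the optimizer, and compare the file sizes. The sketch below is a minimal illustration: the small network, the random training data, and the file names are all made up for the example, and Adam is an assumed optimizer choice.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Small stand-in network; the layer sizes are arbitrary.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# One short pass over random data so the optimizer creates its state.
x = np.random.rand(64, 784).astype("float32")
y = np.random.randint(0, 10, size=(64,))
model.fit(x, y, epochs=1, batch_size=32, verbose=0)

out_dir = tempfile.mkdtemp()
with_opt = os.path.join(out_dir, "with_optimizer.h5")
without_opt = os.path.join(out_dir, "without_optimizer.h5")

# Same weights in both files; only the optimizer data differs.
model.save(with_opt, include_optimizer=True)
model.save(without_opt, include_optimizer=False)

size_with = os.path.getsize(with_opt)
size_without = os.path.getsize(without_opt)
print(f"with optimizer:    {size_with / 1e6:.2f} MB")
print(f"without optimizer: {size_without / 1e6:.2f} MB")
```

If the file saved with the optimizer is noticeably larger, saving the original trained model with `include_optimizer=False` should bring its size close to that of the rebuilt one.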

Let me know if you have any comments. Thanks!