Unable to train a model in Keras

Date: 2018-07-06 05:52:47

Tags: python python-2.7 keras conv-neural-network

I am trying to train an all-in-one convolutional model for face analysis in Keras using the AFLW dataset, which is about 19.2 GB in size. The model summary prints successfully, but the model never actually trains.

My machine has about 4 GB of RAM.

Loading pickle files
Loaded train, test and validation dataset
Loading test images
Loading validation images
dataset/adience.py:100: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  self.test_detection = self.test_dataset["is_face"].as_matrix()
Loaded all dataset and images
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 227, 227, 1)  0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 55, 55, 96)   11712       input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 55, 55, 96)   384         conv2d_1[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 27, 27, 96)   0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 27, 27, 256)  614656      max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 27, 27, 256)  1024        conv2d_2[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 13, 13, 256)  0           batch_normalization_2[0][0]      
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 13, 13, 384)  885120      max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 13, 13, 384)  1327488     conv2d_3[0][0]                   
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 13, 13, 512)  1769984     conv2d_4[0][0]                   
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 6, 6, 256)    393472      max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 6, 6, 256)    393472      conv2d_3[0][0]                   
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 6, 6, 512)    0           conv2d_5[0][0]                   
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 6, 6, 1024)   0           conv2d_8[0][0]                   
                                                                 conv2d_9[0][0]                   
                                                                 max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 6, 6, 256)    262400      concatenate_1[0][0]              
__________________________________________________________________________________________________
flatten_2 (Flatten)             (None, 9216)         0           conv2d_10[0][0]                  
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 2048)         18876416    flatten_2[0][0]                  
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 2048)         0           dense_3[0][0]                    
__________________________________________________________________________________________________
dense_11 (Dense)                (None, 512)          1049088     dropout_3[0][0]                  
__________________________________________________________________________________________________
dropout_10 (Dropout)            (None, 512)          0           dense_11[0][0]                   
__________________________________________________________________________________________________
detection_probablity (Dense)    (None, 2)            1026        dropout_10[0][0]                 
==================================================================================================
Total params: 25,586,242
Trainable params: 25,585,538
Non-trainable params: 704
__________________________________________________________________________________________________
Epoch 1/10

It prints Epoch 1/10 and then appears to stop. Is this a limitation of my computer's processing power?

1 Answer:

Answer 0 (score: 1)

If it started running like that, it probably has enough memory to run properly. You can check the resource monitor to see how much memory is free. Also check whether there is any CPU usage: if the CPU is busy, training is most likely just slow.
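For example, here is a minimal sketch for checking free memory and CPU load from Python while the training script is running (it assumes the psutil package is installed, which is not mentioned in the question):

import psutil

# Free memory in GB, and CPU utilisation sampled over one second.
print("Available RAM: %.1f GB" % (psutil.virtual_memory().available / 1e9))
print("CPU usage: %.1f %%" % psutil.cpu_percent(interval=1.0))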

That is a fairly large model, so training it on a modest CPU can take a very long time.

Make sure Keras's verbosity is set to 1 so that progress is printed after every batch. This is the default, so unless you changed it, it should already be set that way.

model.fit(verbose=1)

Also try reducing the batch size to 1 and see whether you get any output (smaller batches finish faster, so progress shows up sooner).
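As a hedged sketch combining both suggestions (x_train and y_train are placeholders for your own training arrays, not names from the question):

model.fit(x_train, y_train,
          epochs=10,
          batch_size=1,   # smallest possible batches, so the first updates appear quickly
          verbose=1)      # print a progress bar after every batch

If loss values start printing with batch_size=1, the run is working and is simply slow for a model of this size on CPU.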

If it runs but is simply slow, your best bet is to run it on a GPU. If that is not possible, you can try compiling TensorFlow from source so that it uses all available CPU instruction sets and, if applicable, the MKL library, which can speed things up.