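I am trying to train the All-In-One convolutional model for face analysis in Keras using the AFLW dataset, which is about 19.2 GB. The model summary prints successfully, but the model does not train.
My computer has about 4 GB of RAM.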
Loading pickle files
Loaded train, test and validation dataset
Loading test images
Loading validation images
dataset/adience.py:100: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
self.test_detection = self.test_dataset["is_face"].as_matrix()
Loaded all dataset and images
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 227, 227, 1) 0
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 55, 55, 96) 11712 input_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 55, 55, 96) 384 conv2d_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 27, 27, 96) 0 batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 27, 27, 256) 614656 max_pooling2d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 27, 27, 256) 1024 conv2d_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 13, 13, 256) 0 batch_normalization_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 13, 13, 384) 885120 max_pooling2d_2[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D) (None, 13, 13, 384) 1327488 conv2d_3[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 13, 13, 512) 1769984 conv2d_4[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D) (None, 6, 6, 256) 393472 max_pooling2d_1[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 6, 6, 256) 393472 conv2d_3[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 6, 6, 512) 0 conv2d_5[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 6, 6, 1024) 0 conv2d_8[0][0]
conv2d_9[0][0]
max_pooling2d_4[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D) (None, 6, 6, 256) 262400 concatenate_1[0][0]
__________________________________________________________________________________________________
flatten_2 (Flatten) (None, 9216) 0 conv2d_10[0][0]
__________________________________________________________________________________________________
dense_3 (Dense) (None, 2048) 18876416 flatten_2[0][0]
__________________________________________________________________________________________________
dropout_3 (Dropout) (None, 2048) 0 dense_3[0][0]
__________________________________________________________________________________________________
dense_11 (Dense) (None, 512) 1049088 dropout_3[0][0]
__________________________________________________________________________________________________
dropout_10 (Dropout) (None, 512) 0 dense_11[0][0]
__________________________________________________________________________________________________
detection_probablity (Dense) (None, 2) 1026 dropout_10[0][0]
==================================================================================================
Total params: 25,586,242
Trainable params: 25,585,538
Non-trainable params: 704
__________________________________________________________________________________________________
Epoch 1/10
It prints Epoch 1/10 and then stops. Is my computer's computing power the problem?
Answer 0 (score: 1)
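If it gets as far as starting the first epoch, it probably has enough memory to run. Check your resource monitor to see how much memory is free, and check whether there is any CPU usage. If the CPU is busy, training is probably just slow.
That is a fairly large model (about 25.6 million parameters according to the summary), so training it on a small CPU could take a very long time.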
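To put "fairly large" into numbers, here is a rough back-of-the-envelope sketch that converts the parameter counts from the summary into memory. It assumes the weights are stored as 32-bit floats, which is a common default but is not confirmed by the question, and it ignores activations, optimizer state and the dataset itself.

```python
# Parameter counts copied from the model summary in the question.
TOTAL_PARAMS = 25_586_242
TRAINABLE_PARAMS = 25_585_538

# Assumption: each weight is a 32-bit float (4 bytes).
BYTES_PER_FLOAT32 = 4


def mebibytes(num_values, bytes_per_value=BYTES_PER_FLOAT32):
    """Return the size in MiB of `num_values` numbers."""
    return num_values * bytes_per_value / (1024 ** 2)


weights_mib = mebibytes(TOTAL_PARAMS)
# Assumption: one gradient value is kept per trainable weight during a step.
gradients_mib = mebibytes(TRAINABLE_PARAMS)

print(f"weights:   {weights_mib:.1f} MiB")
print(f"gradients: {gradients_mib:.1f} MiB")
```

Under these assumptions the weights come to roughly 100 MB, which fits comfortably in 4 GB. If memory does turn out to be tight, the images held in RAM are a more likely cause than the weights.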
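Make sure Keras's verbosity is set to 1 so that progress is printed for every batch. That is the default, so unless you changed it, it should already be set this way: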
model.fit(verbose=1)
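Also try reducing the batch size to 1 and see whether any output appears (a smaller batch completes sooner, so the first progress update should show up earlier).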
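As a small illustration of why the batch size affects how soon you see output: Keras processes one batch per step, so a smaller batch means more, shorter steps per epoch. The sample count below is purely hypothetical; substitute the size of your own training set.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of steps Keras runs in one epoch (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


# Hypothetical training-set size, NOT taken from the question.
NUM_SAMPLES = 20_000

for batch_size in (1, 32, 128):
    steps = batches_per_epoch(NUM_SAMPLES, batch_size)
    print(f"batch_size={batch_size:>3}: {steps} steps per epoch")
```

With `batch_size=1` each step handles a single image, so if even that produces no progress output after a reasonable wait, the problem is probably not just slow training. The batch size is passed through the `batch_size` argument of `model.fit`.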
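If it runs correctly but slowly, your best option is to run it on a GPU. If you can't do that, you can try compiling TensorFlow from source so that it uses all the instruction sets your CPU supports, along with the MKL library if you want it, which can speed things up.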
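Before spending time on a source build, it may help to see which instruction sets your CPU actually reports. The sketch below assumes an x86 Linux machine, where `/proc/cpuinfo` contains a `flags` line; the list of flags to look for is my own choice of commonly mentioned ones, not something from the question.

```python
# Assumption: these are the instruction-set flags worth checking for.
WANTED_FLAGS = ("sse4_1", "sse4_2", "avx", "avx2", "fma")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            return set(value.split())
    return set()


def supported(cpuinfo_text, wanted=WANTED_FLAGS):
    """Map each wanted flag to True/False depending on whether the CPU reports it."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    for name, present in supported(text).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

If the CPU reports none of these, a custom build is unlikely to gain much from them.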