Calculating the memory required to train a ConvNet (using Caffe)

Time: 2016-08-19 02:48:06

Tags: machine-learning deep-learning caffe conv-neural-network

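When we want to train a convolutional neural network on specific (image) data, is there a formula or guideline for calculating the amount of memory required? Which settings need to be taken into account?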

I implemented a simple ConvNet in Caffe with the following structure: ImageData -> Convolution -> InnerProduct -> SoftmaxWithLoss. This is the output I got:

I0817 21:32:48.073011 11306 layer_factory.hpp:77] Creating layer Layer1
I0817 21:32:48.073108 11306 net.cpp:91] Creating Layer Layer1
I0817 21:32:48.073148 11306 net.cpp:399] Layer1 -> data
I0817 21:32:48.073199 11306 net.cpp:399] Layer1 -> label
I0817 21:32:48.073256 11306 image_data_layer.cpp:38] Opening file ./data/ultrax/trainx/list.txt
I0817 21:32:48.073309 11306 image_data_layer.cpp:56] A total of 1 images.
I0817 21:32:48.084810 11306 image_data_layer.cpp:83] output data size: 32,3,224,224
I0817 21:32:48.151801 11306 net.cpp:141] Setting up Layer1
I0817 21:32:48.151892 11306 net.cpp:148] Top shape: 32 3 224 224 (4816896)
I0817 21:32:48.151921 11306 net.cpp:148] Top shape: 32 (32)
I0817 21:32:48.151942 11306 net.cpp:156] Memory required for data: 19267712
I0817 21:32:48.151968 11306 layer_factory.hpp:77] Creating layer Layer2
I0817 21:32:48.152020 11306 net.cpp:91] Creating Layer Layer2
I0817 21:32:48.152068 11306 net.cpp:425] Layer2 <- data
I0817 21:32:48.152104 11306 net.cpp:399] Layer2 -> conv1
I0817 21:32:48.152740 11306 net.cpp:141] Setting up Layer2
I0817 21:32:48.152771 11306 net.cpp:148] Top shape: 32 64 216 216 (95551488)
I0817 21:32:48.152796 11306 net.cpp:156] Memory required for data: 401473664
I0817 21:32:48.152830 11306 layer_factory.hpp:77] Creating layer Layer3
I0817 21:32:48.152863 11306 net.cpp:91] Creating Layer Layer3
I0817 21:32:48.152885 11306 net.cpp:425] Layer3 <- conv1
I0817 21:32:48.152910 11306 net.cpp:399] Layer3 -> fc
I0817 21:33:05.273979 11306 net.cpp:141] Setting up Layer3
I0817 21:33:05.274063 11306 net.cpp:148] Top shape: 32 64 (2048)
I0817 21:33:05.274085 11306 net.cpp:156] Memory required for data: 401481856
I0817 21:33:05.274127 11306 layer_factory.hpp:77] Creating layer loss
I0817 21:33:05.512080 11306 net.cpp:91] Creating Layer loss
I0817 21:33:05.512157 11306 net.cpp:425] loss <- fc
I0817 21:33:05.512195 11306 net.cpp:425] loss <- label
I0817 21:33:05.512229 11306 net.cpp:399] loss -> loss
I0817 21:33:05.512287 11306 layer_factory.hpp:77] Creating layer loss
I0817 21:33:05.512351 11306 net.cpp:141] Setting up loss
I0817 21:33:05.512387 11306 net.cpp:148] Top shape: (1)
I0817 21:33:05.512413 11306 net.cpp:151]     with loss weight 1
I0817 21:33:05.710017 11306 net.cpp:156] Memory required for data: 401481860
I0817 21:33:05.710049 11306 net.cpp:217] loss needs backward computation.
I0817 21:33:05.710068 11306 net.cpp:217] Layer3 needs backward computation.
I0817 21:33:05.710084 11306 net.cpp:217] Layer2 needs backward computation.
I0817 21:33:05.733338 11306 net.cpp:219] Layer1 does not need backward computation.
I0817 21:33:05.733397 11306 net.cpp:261] This network produces output loss
I0817 21:33:05.733440 11306 net.cpp:274] Network initialization done.
I0817 21:33:06.133980 11306 solver.cpp:60] Solver scaffolding done.
I0817 21:33:06.459244 11306 caffe.cpp:219] Starting Optimization
I0817 21:33:06.483875 11306 solver.cpp:279] Solving UltraNerveSegmentation
I0817 21:33:06.483947 11306 solver.cpp:280] Learning Rate Policy: step
I0817 21:33:20.800559 11306 solver.cpp:337] Iteration 0, Testing net (#0)
I0817 21:42:49.588776 11306 solver.cpp:404]     Test net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0817 21:48:44.556177 11306 solver.cpp:228] Iteration 0, loss = 87.3365
I0817 21:48:46.329630 11306 solver.cpp:244]     Train net output #0: loss = 87.3365 (* 1 = 87.3365 loss)
I0817 21:48:46.760141 11306 sgd_solver.cpp:106] Iteration 0, lr = 0.0001
Killed

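Someone here suggested that this may be a memory problem. So, IMHO, it would be nice to be able to estimate the required memory before training the net, so that we don't end up with a killed process after waiting this long.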

1 answer:

Answer 0 (score: 1)

Assuming you are using 32-bit floating-point numbers, you can estimate that every element of every convolutional layer is one float.

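So, for each layer, calculate the number of neurons in that layer based on the filter size, and sum over all layers. Multiplying that total by 32 gives the number of bits; dividing by 8 (equivalently, multiplying the element count by 4) gives the number of bytes you are using. You have to do this for the conv, pooling and normalization layers in addition to the output layer. You also have to multiply by your batch size, whatever it is, because your machine will probably load the whole batch into memory (unless there is a way to stream the batch through). Note that the "Top shape" element counts in a Caffe log already include the batch size as their first dimension.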
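As a rough illustration of this estimate, the following Python sketch sums the "Top shape" sizes copied from the log in the question and multiplies by 4 bytes per 32-bit float. The helper name `blob_bytes` is made up for this example. The second part goes beyond the answer: it estimates the weight memory of the InnerProduct layer, assuming it is fully connected to the whole conv1 output and has one bias per output (both are assumptions, since the prototxt is not shown).

```python
from math import prod  # requires Python 3.8+

BYTES_PER_FLOAT32 = 4  # a 32-bit float occupies 4 bytes


def blob_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Bytes needed to store one blob of the given shape (batch size included)."""
    return prod(shape) * bytes_per_element


# Top shapes copied from the log in the question; the first dimension is the batch size.
top_shapes = [
    (32, 3, 224, 224),   # Layer1 -> data
    (32,),               # Layer1 -> label
    (32, 64, 216, 216),  # Layer2 -> conv1
    (32, 64),            # Layer3 -> fc
    (),                  # loss, a single scalar (prod of an empty shape is 1)
]

activation_bytes = sum(blob_bytes(shape) for shape in top_shapes)
print(f"activations: {activation_bytes} bytes ({activation_bytes / 2**20:.1f} MiB)")

# Assumption (not stated in the answer): the InnerProduct layer connects every
# conv1 output value of one image to each of its 64 outputs, plus one bias each.
fc_inputs = 64 * 216 * 216
fc_outputs = 64
fc_weight_bytes = (fc_inputs * fc_outputs + fc_outputs) * BYTES_PER_FLOAT32
print(f"fc weights:  {fc_weight_bytes} bytes ({fc_weight_bytes / 2**20:.1f} MiB)")
```

The activation total comes out to 401481860 bytes, which matches the last "Memory required for data" line in the log. Training is likely to need noticeably more than this figure, because gradients are typically stored alongside both activations and weights, so the numbers are better read as a lower bound.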