What is the memory size of a convolutional neural network?

Time: 2016-07-03 08:41:44

Tags: deep-learning conv-neural-network

I'm reading http://cs231n.github.io/convolutional-networks/

I don't understand why the memory size of the second layer (CONV3-64: [224x224x64]) is 224x224x64.

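  1. I know there are 64 filters of size 3x3, but why is the input size multiplied by 64?
  2. Why is the number of weights in the CONV3-128 layer (3x3x64)x128 and not (3x3x64x64)x128? (the previous layer's weights multiplied by the 128 new filters)

The table from that page, for reference: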

    INPUT: [224x224x3]        memory:  224*224*3=150K   weights: 0
    CONV3-64: [224x224x64]  memory:  224*224*64=3.2M   weights: (3*3*3)*64 = 1,728
    CONV3-64: [224x224x64]  memory:  224*224*64=3.2M   weights: (3*3*64)*64 = 36,864
    POOL2: [112x112x64]  memory:  112*112*64=800K   weights: 0
    CONV3-128: [112x112x128]  memory:  112*112*128=1.6M   weights: (3*3*64)*128 = 73,728
    CONV3-128: [112x112x128]  memory:  112*112*128=1.6M   weights: (3*3*128)*128 = 147,456
    POOL2: [56x56x128]  memory:  56*56*128=400K   weights: 0
    CONV3-256: [56x56x256]  memory:  56*56*256=800K   weights: (3*3*128)*256 = 294,912
    CONV3-256: [56x56x256]  memory:  56*56*256=800K   weights: (3*3*256)*256 = 589,824
    CONV3-256: [56x56x256]  memory:  56*56*256=800K   weights: (3*3*256)*256 = 589,824
    POOL2: [28x28x256]  memory:  28*28*256=200K   weights: 0
    CONV3-512: [28x28x512]  memory:  28*28*512=400K   weights: (3*3*256)*512 = 1,179,648
    CONV3-512: [28x28x512]  memory:  28*28*512=400K   weights: (3*3*512)*512 = 2,359,296
    CONV3-512: [28x28x512]  memory:  28*28*512=400K   weights: (3*3*512)*512 = 2,359,296
    POOL2: [14x14x512]  memory:  14*14*512=100K   weights: 0
    CONV3-512: [14x14x512]  memory:  14*14*512=100K   weights: (3*3*512)*512 = 2,359,296
    CONV3-512: [14x14x512]  memory:  14*14*512=100K   weights: (3*3*512)*512 = 2,359,296
    CONV3-512: [14x14x512]  memory:  14*14*512=100K   weights: (3*3*512)*512 = 2,359,296
    POOL2: [7x7x512]  memory:  7*7*512=25K  weights: 0
    FC: [1x1x4096]  memory:  4096  weights: 7*7*512*4096 = 102,760,448
    FC: [1x1x4096]  memory:  4096  weights: 4096*4096 = 16,777,216
    FC: [1x1x1000]  memory:  1000 weights: 4096*1000 = 4,096,000
    
    TOTAL memory: 24M * 4 bytes ~= 93MB / image (only forward! ~*2 for bwd)
    TOTAL params: 138M parameters
    

1 answer:

Answer 0 (score: 0)

Your first question is about the memory that has to be stored during the forward pass.

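  1. The 64 in 224x224x64 comes from the CONV3-64 layer itself. When a single 224x224x3 image goes through that layer, it is convolved with 64 filters of size 3x3x3, and each filter produces one 224x224 output map (the table shows the spatial size unchanged). Those 64 maps have to be kept in memory so the forward pass can feed them to the next layer, which gives 224x224x64 values.
  2. Your second question is about the weight parameters of the network.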

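    1. In the first CONV3-128 layer the input is 112x112x64, so applying a single "3x3 filter" really means applying a different 3x3 filter to each of the 64 input channels and summing the results. In other words, one filter is a 3x3x64 volume that slides over the 112x112x64 input volume and outputs a single 112x112 map. The layer is set to output 128 channels, so you need 128 of these filters, hence 128 * 64 * 3 * 3 = 73,728 weights in this layer. The 64 filters of the previous layer are not counted again; only their output depth (64) enters the formula.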
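
If it helps, both rules can be checked with a few lines of Python. This is only a rough sketch, not code from the CS231n notes: the names `CFG`, `FC` and `summarize` are made up for illustration, biases are ignored (as in the table), and it assumes every conv is 3x3 and keeps the spatial size while every pool halves it, which is what the table shows.

```python
# Layer list matching the table in the question (hypothetical encoding):
# an int is a 3x3 conv with that many filters, "P" is a 2x2 pool.
CFG = [64, 64, "P", 128, 128, "P", 256, 256, 256, "P",
       512, 512, 512, "P", 512, 512, 512, "P"]
FC = [4096, 4096, 1000]  # fully connected layer sizes


def summarize(height=224, width=224, channels=3):
    """Return (activations, weights), one entry per table row, biases ignored."""
    activations = [height * width * channels]  # INPUT row
    weights = [0]
    for layer in CFG:
        if layer == "P":
            # Pooling halves the spatial size and has no weights.
            height //= 2
            width //= 2
            weights.append(0)
        else:
            # Each of the `layer` filters covers 3x3 pixels and ALL input channels.
            weights.append(3 * 3 * channels * layer)
            channels = layer  # one output map per filter
        # Values stored for this layer's output during the forward pass.
        activations.append(height * width * channels)
    fan_in = height * width * channels  # 7*7*512 after the last pool
    for units in FC:
        weights.append(fan_in * units)
        activations.append(units)
        fan_in = units
    return activations, weights


activations, weights = summarize()
print(activations[1], weights[1])  # first CONV3-64: 3211264 1728
print(weights[4])                  # first CONV3-128: 73728
print(sum(weights))                # 138344128, the "138M parameters"
```

Index 1 is the first CONV3-64 row and index 4 is the first CONV3-128 row. Note that the weight count only ever uses the depth of the incoming volume (`channels`), never the number of weights in the previous layer.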