ConvNet: Not getting the desired output from the max pooling function

Asked: 2019-04-16 10:16:24

Tags: python numpy deep-learning conv-neural-network

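I am trying to implement a CNN using only numpy. I am running into an index error in the max pooling layer.

The function takes an array of feature maps as its argument. The feature map array is an ndarray. Here is my function: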

feature_map = np.array([[[4, 3, 4],[2, 4, 3],[2, 3, 4]],
          [[3, 4, 2],[2, 4, 4],[2, 4, 2]],
          [[5, 7, 6],[2, 1, 3],[3, 3, 8]],    
          [[3, 3, 2],[1, 3, 5],[7, 4, 9,]]])

def pool_forward(feature_map, mode = "max", size=2, stride=2):

    f_num, f_row, f_col = feature_map.shape
    #Preparing the output of the pooling operation.
    pool_out = np.zeros((np.uint16((f_row-size+1)/stride+1),
                        np.uint16((f_col-size+1)/stride+1), f_num))

    for map_num in range(f_num):
        r2 = 0
        for r in np.arange(0,f_row-size+1, stride):
            c2 = 0
            for c in np.arange(0, f_col-size+1, stride):
                pool_out[r2, c2, map_num] = np.max([feature_map[r:r+size,  
                                            c:c+size, map_num]])
                c2 = c2 + 1
            r2 = r2 +1

    return pool_out

This is the error I get:

--------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-104-ccb65cb3a606> in <module>()
----> 1 feature_pool = pool_forward(features)
  2 feature_pool.shape

<ipython-input-102-5d4c4e76f99a> in pool_forward(feature_map, mode, filter_size, stride)
     13             c2 = 0
     14             for c in np.arange(0, f_col-filter_size+1, stride):
---> 15                 pool_out[r2, c2, map_num] = np.max([feature_map[r:r+filter_size,  c:c+filter_size, map_num]])
     16                 c2 = c2 + 1
     17             r2 = r2 +1

IndexError: index 3 is out of bounds for axis 2 with size 3

Please help me with this.

1 Answer:

Answer 0 (score: 2)

Please see the update section at the end of this answer.

The error is in this line:

pool_out[r2, c2, map_num] = np.max([feature_map[r:r+size, c:c+size, map_num]])

It should be:

pool_out[r2, c2, map_num] = np.max([feature_map[map_num, r:r+size, c:c+size]])
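
A quick shape check makes the mix-up visible. This is only a sketch that assumes the `feature_map` array from the question, which is laid out as (maps, rows, columns); the names `good` and `bad_index_raised` are introduced here for illustration.

```python
import numpy as np

feature_map = np.array([[[4, 3, 4], [2, 4, 3], [2, 3, 4]],
                        [[3, 4, 2], [2, 4, 4], [2, 4, 2]],
                        [[5, 7, 6], [2, 1, 3], [3, 3, 8]],
                        [[3, 3, 2], [1, 3, 5], [7, 4, 9]]])

# Axis 0 indexes the feature maps; axes 1 and 2 are rows and columns.
print(feature_map.shape)  # (4, 3, 3)

size = 2
map_num = 3  # index of the last feature map

# Map index first: selects a 2x2 window from feature map 3.
good = feature_map[map_num, 0:size, 0:size]
print(good)

# Map index last: axis 2 only has 3 entries, so index 3 is out of bounds.
try:
    feature_map[0:size, 0:size, map_num]
    bad_index_raised = False
except IndexError as exc:
    bad_index_raised = True
    print(exc)
```

The failure only shows up once `map_num` reaches 3, because indices 0 to 2 are still valid on axis 2 and would silently select the wrong data.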

Now:

def pool_forward(feature_map, mode = "max", size=2, stride=2):

    f_num, f_row, f_col = feature_map.shape
    #Preparing the output of the pooling operation.
    pool_out = np.zeros((np.uint16((f_row-size+1)/stride+1),
                        np.uint16((f_col-size+1)/stride+1), f_num))

    for map_num in range(f_num):
        r2 = 0
        for r in np.arange(0,f_row-size+1, stride):
            c2 = 0
            for c in np.arange(0, f_col-size+1, stride):
                pool_out[r2, c2, map_num] = np.max([feature_map[map_num, r:r+size, c:c+size]])
                c2 = c2 + 1
            r2 = r2 +1

    return pool_out



feature_map = np.array([[[4, 3, 4],[2, 4, 3],[2, 3, 4]],
          [[3, 4, 2],[2, 4, 4],[2, 4, 2]],
          [[5, 7, 6],[2, 1, 3],[3, 3, 8]],    
          [[3, 3, 2],[1, 3, 5],[7, 4, 9,]]])

pool_forward(feature_map)

Returns:

array([[[4., 4., 7., 3.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.]]])  
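
The zeros are a consequence of the output shape. A common convention for pooling without padding is floor((n - size) / stride) + 1 window positions per axis. The sketch below compares that with the expression used to allocate `pool_out` in `pool_forward` above; `pooled_size` and `allocated_size` are hypothetical helper names, not part of the original code.

```python
def pooled_size(n, size, stride):
    """Number of window positions along one axis, no padding (floor convention)."""
    return (n - size) // stride + 1


def allocated_size(n, size, stride):
    """Size given by the expression (n - size + 1) / stride + 1, truncated to an integer."""
    return int((n - size + 1) / stride + 1)


n, size, stride = 3, 2, 2
print(pooled_size(n, size, stride))     # 1
print(allocated_size(n, size, stride))  # 2
```

Only one 2x2 window fits per map, at position [0, 0], but a 2x2 output is allocated, so the other three entries keep their initial zeros.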

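Update: The premise of the question is incorrect. With an input shape of 3 * 3, a pooling window of 2 * 2 and a stride of 2, the windows do not fit the input evenly, so you may want to look at fractional_max_pooling instead. For regular max_pooling, you should choose a stride of 1 here (that is, the value (f_row-size)/stride should be an integer). In that case, have a look at the following code: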

import numpy as np

feature_map = np.array([[[4, 3, 4],[2, 4, 3],[2, 3, 4]],
          [[3, 4, 2],[2, 4, 4],[2, 4, 2]],
          [[5, 7, 6],[2, 1, 3],[3, 3, 8]],
          [[3, 3, 2],[1, 3, 5],[7, 4, 9]]])

def pool_forward(feature_map, mode = "max", size=2, stride=1):
    # feature_map has shape (f_num, f_row, f_col); mode is currently unused.
    f_num, f_row, f_col = feature_map.shape
    # One output value per window position along each axis.
    pool_out = np.zeros((f_num, (f_row-size)//stride+1,
                         (f_col-size)//stride+1))
    for z in range(f_num):
        for r in np.arange(0, f_row-size+1, stride):
            for c in np.arange(0, f_col-size+1, stride):
                # r and c are input offsets; dividing by stride gives the output position.
                pool_out[z, r//stride, c//stride] = np.max(feature_map[z, r:r+size, c:c+size])
    return pool_out

pool_forward(feature_map) returns:

array([[[4., 4.],
        [4., 4.]],

       [[4., 4.],
        [4., 4.]],

       [[7., 7.],
        [3., 8.]],

       [[3., 5.],
        [7., 9.]]])

This seems to be correct. I also dropped the variables c2 and r2, since they seemed unnecessary.
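
If the explicit loops become a bottleneck, the same max pooling can be sketched without them. The version below is an alternative, not the code from the question; it assumes NumPy 1.20 or newer (for `sliding_window_view`) and the same (maps, rows, columns) layout, and `pool_forward_vectorized` is a name made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def pool_forward_vectorized(feature_map, size=2, stride=1):
    # windows has shape (f_num, f_row-size+1, f_col-size+1, size, size):
    # one size x size view for every possible window position.
    windows = sliding_window_view(feature_map, (size, size), axis=(1, 2))
    # Keep only every stride-th window position along rows and columns.
    windows = windows[:, ::stride, ::stride]
    # Reduce each window to its maximum.
    return windows.max(axis=(-2, -1))


feature_map = np.array([[[4, 3, 4], [2, 4, 3], [2, 3, 4]],
                        [[3, 4, 2], [2, 4, 4], [2, 4, 2]],
                        [[5, 7, 6], [2, 1, 3], [3, 3, 8]],
                        [[3, 3, 2], [1, 3, 5], [7, 4, 9]]])

result = pool_forward_vectorized(feature_map)
print(result.shape)  # (4, 2, 2)
```

For stride 1 this gives the same values as the loop version, but with the integer dtype of the input rather than floats.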