Question

I'm trying to build my first neural network, and I'm running into a problem computing the dot product when I try to train the model.

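I've read that the dimensions of my weight matrix and my activation matrix need to align, but I don't know how to make that happen. I think I'm struggling because I don't fully understand why my activations end up so small. These are my forward propagation steps: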

import numpy

def linear_forward(A_prev, W, b):
   # W: (units, previous units), A_prev: (previous units, examples)
   Z = numpy.dot(W, A_prev) + b
   cache = (A_prev, W, b)
   return Z, cache

def dropout(A, dropout):
   # Inverted dropout: zero each unit with probability `dropout`, then rescale
   keep_ratio = 1 - dropout
   D1 = numpy.random.rand(A.shape[0], A.shape[1])
   D1 = D1 < keep_ratio
   A = A * D1
   A = A / keep_ratio
   return A

def linear_activation_forward(A_prev, W, b, dropout_val):
   Z, linear_cache = linear_forward(A_prev, W, b)
   A, activation_cache = Relu.relu(Z)
   # Compare the rate, not the dropout function itself
   if dropout_val != 0:
      A = dropout(A, dropout_val)
   cache = (linear_cache, activation_cache)

   return A, cache


def L_model_forward(X, parameters, dropout_val):
   # The rate is named dropout_val so it does not shadow the dropout() function
   A = X
   caches = []
   L = len(parameters) // 2   # each layer has one W and one b

   # Layers 1 .. L-1
   for l in range(1, L):
      A_prev = A
      A, cache = linear_activation_forward(
         A_prev, parameters["W" + str(l)], parameters["b" + str(l)], dropout_val)
      caches.append(cache)

   # Layer L; pass 0 here so dropout is applied only once, below
   AL, cache = linear_activation_forward(
      A, parameters["W" + str(L)], parameters["b" + str(L)], 0)
   caches.append(cache)

   if dropout_val != 0:
      AL = dropout(AL, dropout_val)

   assert AL.shape == (1, X.shape[1])
   return AL, caches

This is how I initialize the parameters:

def initialize_parameters_deep(input_count, unit_count, layer_count):
    parameters = {}

    # number of layers in the network
    L = layer_count

    for l in range(1, L):
        parameters['W' + str(l)] = numpy.random.randn(unit_count, input_count) * numpy.sqrt(2./input_count)
        parameters['b' + str(l)] = numpy.zeros((unit_count, 1))

    return parameters
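For comparison, here is a minimal sketch of an initializer in which each weight matrix is sized from the layer before it, so that `W` for layer l has shape (units in layer l, units in layer l-1). The function name `initialize_parameters_by_dims`, the `layer_dims` list, and the `seed` argument are hypothetical and not part of the code above; the `sqrt(2 / fan_in)` scaling is carried over from the original. This is an assumption about what the network needs, not a confirmed fix.

```python
import numpy

def initialize_parameters_by_dims(layer_dims, seed=0):
    """layer_dims = [n_input, n_hidden_1, ..., n_output] (hypothetical helper)."""
    rng = numpy.random.default_rng(seed)
    parameters = {}
    # One (W, b) pair per layer, including the output layer
    for l in range(1, len(layer_dims)):
        fan_in, fan_out = layer_dims[l - 1], layer_dims[l]
        # Columns of W match the rows of the incoming activations
        parameters["W" + str(l)] = rng.standard_normal((fan_out, fan_in)) * numpy.sqrt(2.0 / fan_in)
        parameters["b" + str(l)] = numpy.zeros((fan_out, 1))
    return parameters

# Made-up sizes: 12 inputs, two hidden layers of 5 units, 1 output unit
params = initialize_parameters_by_dims([12, 5, 5, 1])
for name in sorted(params):
    print(name, params[name].shape)
```

With a layout like this, only `W1` has as many columns as there are input features, and a final layer of 1 unit matches the `assert AL.shape == (1, X.shape[1])` check in `L_model_forward`.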

Here's an image giving the dimensions of each layer in the network.

I'm just wondering if anyone could point me in the right direction as to what I'm doing wrong.

**Edit**

I've added the matrix dimensions after the first layer, which is the point where the exception is raised.

def L_model_forward(X, parameters, dropout_val):
   A = X
   caches = []
   L = len(parameters) // 2

   for l in range(1, L):
      A_prev = A
      A, cache = linear_activation_forward(
         A_prev, parameters["W" + str(l)], parameters["b" + str(l)], dropout_val)
      caches.append(cache)

   # The fourth argument is required by linear_activation_forward; 0 disables dropout here
   AL, cache = linear_activation_forward(
      A, parameters["W" + str(L)], parameters["b" + str(L)], 0)
   caches.append(cache)

l = 1 (first image), l = 2 (second image)

Error when trying to train the neural network: shapes (166,27648) and (166,760) not aligned: 27648 (dim 1) != 166 (dim 0)
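The message can be reproduced at a small scale. In the sketch below the sizes are made up (4 units, 10 inputs and 7 examples stand in for 166, 27648 and 760), and the names `W2_bad` and `W2_ok` are hypothetical. The second layer's weight matrix still has `input_count` columns, while the activations coming out of the first layer have only `unit_count` rows, so the inner dimensions of the dot product disagree.

```python
import numpy

unit_count, input_count, m = 4, 10, 7   # stand-ins for 166, 27648, 760

X = numpy.ones((input_count, m))
W1 = numpy.ones((unit_count, input_count))
W2_bad = numpy.ones((unit_count, input_count))   # same shape as W1, as in the question
W2_ok = numpy.ones((unit_count, unit_count))     # columns match the rows of A1

A1 = numpy.dot(W1, X)                  # (4, 10) . (10, 7) -> (4, 7)
print("A1", A1.shape)

try:
    numpy.dot(W2_bad, A1)              # (4, 10) . (4, 7): inner dims 10 != 4
    failed = False
except ValueError as err:
    failed = True
    print("error:", err)

A2 = numpy.dot(W2_ok, A1)              # (4, 4) . (4, 7) -> (4, 7)
print("A2", A2.shape)
```

Read this way, the two shapes in the error appear to be the second layer's `W` (166, 27648) and the first layer's output (166, 760).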

0 answers: