I'm attempting to complete code for backpropagation, and the final step I have left is computing the change in weights and biases (using a quadratic cost). This step involves a matrix multiplication of two arrays after transposing one of them.
import numpy as np

# necessary functions for this example
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def prime(z):
    # derivative of the sigmoid
    return sigmoid(z) * (1 - sigmoid(z))

def cost_derivative(output_activations, y):
    return output_activations - y
# Mock weight and bias matrices
weights = [np.array([[ 1, 0, 2],
[2, -1, 0],
[4, -1, 0],
[1, 3, -2],
[0, 0, -1]]),
np.array([[2, 0, -1, -1, 2],
[0, 2, -1, -1, 0]])]
biases = [np.array([-1, 2, 0, 0, 4]), np.array([-2, 1])]
# The mock training examples: (input, target) pairs
q = [(np.array([1, -2, 3]), np.array([0, 1])),
(np.array([2, -3, 5]), np.array([1, 0])),
(np.array([3, 6, -1]), np.array([1, 0])),
(np.array([4, -1, -1]), np.array([0, 0]))]
nabla_b = [np.zeros(b.shape) for b in biases]
nabla_w = [np.zeros(w.shape) for w in weights]
for x, y in q:
    activation = x
    activations = [x]   # store activations layer by layer
    zs = []             # store weighted inputs layer by layer
    for w, b in zip(weights, biases):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    # Computation of the last layer
    delta = cost_derivative(activations[-1], y) * prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(np.transpose(activations[-2]), delta)
I've printed the outputs for delta: the first iteration gives

[ 0.14541528 -0.14808645]

which is a 1x2 row, and

activations[-2] = [9.97527377e-01 9.97527377e-01 9.97527377e-01 1.67014218e-05 7.31058579e-01]

which is a 1x5 row. Transposing activations[-2] should give a 5x1 column, and multiplying that by the 1x2 delta should yield a 5x2 matrix, but it doesn't.
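To isolate the problem, here is a minimal standalone sketch with made-up values of the same shapes (not the real network outputs). If my understanding is right, the issue is that transposing a 1-D NumPy array is a no-op, so the arrays need explicit 2-D shapes (or `np.outer`) before the product comes out 5x2:

```python
import numpy as np

# Made-up values with the same shapes as delta and activations[-2]
delta = np.array([0.145, -0.148])                            # shape (2,)
a_prev = np.array([0.997, 0.997, 0.997, 0.0000167, 0.731])   # shape (5,)

print(a_prev.T.shape)   # (5,) -- transposing a 1-D array changes nothing
# np.dot(a_prev.T, delta) raises ValueError: the shapes (5,) and (2,)
# are not aligned for a dot product

# Reshaping to an explicit column and row vector gives the expected 5x2
nabla = np.dot(a_prev.reshape(-1, 1), delta.reshape(1, -1))
print(nabla.shape)      # (5, 2)

# np.outer computes the same outer product in one call
print(np.outer(a_prev, delta).shape)  # (5, 2)
```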