Question

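I want to predict heart disease using a neural network trained with backpropagation. For this I used the UCI heart disease dataset linked here: processed cleveland. I took the code from the following blog post, Build a flexible Neural Network with Backpropagation in Python, and modified it slightly for my own dataset. My code is as follows: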

import numpy as np
import csv

reader = csv.reader(open("cleveland_data.csv"), delimiter=",")
x = list(reader)
result = np.array(x).astype("float")

X = result[:, :13]
y0 = result[:, 13]
y1 = np.array([y0])
y = y1.T

# scale units
X = X / np.amax(X, axis=0)  # maximum of X array

class Neural_Network(object):
    def __init__(self):
        # parameters
        self.inputSize = 13
        self.outputSize = 1
        self.hiddenSize = 13

        # weights
        self.W1 = np.random.randn(self.inputSize, self.hiddenSize)  
        self.W2 = np.random.randn(self.hiddenSize, self.outputSize)  

    def forward(self, X):
        # forward propagation through our network
        self.z = np.dot(X, self.W1)  
        self.z2 = self.sigmoid(self.z)  # activation function
        self.z3 = np.dot(self.z2, self.W2)  
        o = self.sigmoid(self.z3)  # final activation function
        return o

    def sigmoid(self, s):
        # activation function
        return 1 / (1 + np.exp(-s))

    def sigmoidPrime(self, s):
        # derivative of sigmoid
        return s * (1 - s)

    def backward(self, X, y, o):
        # backward propagate through the network
        self.o_error = y - o  # error in output
        self.o_delta = self.o_error * self.sigmoidPrime(o)  # applying derivative of sigmoid to error

        self.z2_error = self.o_delta.dot(
            self.W2.T)  # z2 error: how much our hidden layer weights contributed to output error
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.z2)  # applying derivative of sigmoid to z2 error

        self.W1 += X.T.dot(self.z2_delta)  # adjusting first set (input --> hidden) weights
        self.W2 += self.z2.T.dot(self.o_delta)  # adjusting second set (hidden --> output) weights

    def train(self, X, y):
        o = self.forward(X)
        self.backward(X, y, o)


NN = Neural_Network()
for i in range(100):  # trains the NN 100 times
    print("Input: \n" + str(X))
    print("Actual Output: \n" + str(y))
    print("Predicted Output: \n" + str(NN.forward(X)))
    print("Loss: \n" + str(np.mean(np.square(y - NN.forward(X)))))  # mean sum squared loss
    print("\n")
    NN.train(X, y)

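But when I run this code, all of my predicted outputs become 1 after a few iterations and then stay the same for all 100 iterations. What is the problem in the code?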

Answer 1

A few mistakes I noticed:

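Your network's output is a sigmoid, i.e. a value in [0, 1], which is suitable for predicting probabilities. But the targets appear to be values in [0, 4]. This explains why the network tries to maximize its output to get as close as possible to the large labels. It cannot go above 1.0, so it gets stuck there.

You should either remove the final sigmoid or preprocess the labels and scale them to [0, 1]. Either option will make it learn better.
You don't use a learning rate (effectively setting it to 1.0), which is probably a bit high, so the NN may diverge. My experiments showed that 0.01 is a good learning rate, but feel free to play around with it.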

Apart from that, your backprop seems to work correctly.
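
As a rough illustration of the two fixes above, here is a minimal sketch that scales the labels to [0, 1] and multiplies each weight update by a learning rate of 0.01. It uses synthetic data in place of cleveland_data.csv (an assumption, so the snippet runs on its own), and the names learning_rate, y_raw and predictions are mine, not from the question. The network itself is the same 13-13-1 sigmoid architecture with the same update rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the Cleveland data (assumption): 100 rows,
# 13 features in [0, 1], integer labels 0..4 derived from the first feature.
X = rng.random((100, 13))
y_raw = np.clip(np.round(4 * X[:, :1]), 0, 4)

# Fix 1: scale the labels into the sigmoid's output range [0, 1].
y = y_raw / 4.0


def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))


def sigmoid_prime(a):
    # Derivative of the sigmoid, written in terms of its output a.
    return a * (1.0 - a)


W1 = rng.standard_normal((13, 13))  # input -> hidden weights
W2 = rng.standard_normal((13, 1))   # hidden -> output weights

# Fix 2: an explicit learning rate instead of an implicit 1.0.
learning_rate = 0.01

losses = []
for _ in range(200):
    # Forward pass.
    hidden = sigmoid(X @ W1)
    out = sigmoid(hidden @ W2)
    losses.append(np.mean(np.square(y - out)))

    # Backward pass, same deltas as in the question.
    out_delta = (y - out) * sigmoid_prime(out)
    hidden_delta = (out_delta @ W2.T) * sigmoid_prime(hidden)

    # Scaled weight updates.
    W1 += learning_rate * (X.T @ hidden_delta)
    W2 += learning_rate * (hidden.T @ out_delta)

# Map the network output back to the original 0..4 label range.
predictions = out * 4.0
print("first loss:", losses[0], "last loss:", losses[-1])
```

With the real dataset you would replace the synthetic X and y_raw with the arrays loaded from the CSV. Dividing by 4 assumes the largest label is 4, as in the processed Cleveland file.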

Implementing backpropagation with numpy and Python for the Cleveland dataset
