I am building a simple neural network from scratch using the Pima Indians diabetes dataset, which can be downloaded from the UCI Machine Learning Repository. When I run my code, the error is exactly the same on every iteration and I don't know why this happens, but if I use XOR as the data it works fine.
Here is my code:
## Load Dependencies
import numpy as np
from sklearn.preprocessing import MinMaxScaler
## Seeding to reproduce random generated results
np.random.seed(1)
## We take input (X) and output (y)
data = np.loadtxt('diabetes.txt', delimiter=',')
scaler = MinMaxScaler()
scaler.fit(data)
data = scaler.transform(data)
X = data[:,0:8]
y = data[:,8].reshape(768,1)
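## Note: the scaler above is fit on all nine columns, so the 0/1 outcome label in
## the last column is scaled together with the features (being already in [0, 1], it is unchanged).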
## Define our activation function, in our case we will use sigmoid function: 1 / (1 + exp(-x))
def sigmoid(x, deriv=False):
    if deriv == True:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
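## Note: with deriv=True the argument is expected to already be a sigmoid output,
## so x * (1 - x) corresponds to sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)).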
## Initialize weights with random values
wh = 2 * np.random.random((8, 768)) - 1
wo = 2 * np.random.random((768, 1)) - 1
# Training time
for i in range(1000):
    ## Forward propagation
    h0 = X
    ## input * weight, then activate (no bias term is used here)
    h1 = sigmoid(np.dot(h0, wh))
    outl = sigmoid(np.dot(h1, wo))
    ## Compute the error of the predicted output layer against the actual result
    errorout = y - outl
    ## Compute the slope (gradient/derivative) of the hidden and output layers.
    ## The gradient of the sigmoid can be returned as x * (1 - x).
    ## Compute the change factor (delta) at the output layer:
    ## the error multiplied by the slope of the output layer activation
    deltaoutl = errorout * sigmoid(outl, deriv=True)
    ## At this step the error propagates back into the network, i.e. we get the error at the hidden layer.
    ## For this, we take the dot product of the output layer delta with the weights of the edges
    ## between the hidden and output layer (wo.T).
    errorh1 = np.dot(deltaoutl, wo.T)
    ## Compute the change factor (delta) at the hidden layer:
    ## multiply the error at the hidden layer by the slope of the hidden layer activation
    deltah1 = errorh1 * sigmoid(h1, deriv=True)
    ## Print error values
    if i % 10000:
        print("Error :" + str(np.mean(np.abs(errorout))))
    ## Update the weights at the output and hidden layers:
    ## the weights in the network are updated from the errors calculated for the training examples.
    wh += np.dot(h0.T, deltah1)
    wo += np.dot(h1.T, deltaoutl)
The result is:
Error :0.651041666664
Error :0.651041666664
Error :0.651041666664
Error :0.651041666664
Error :0.651041666664
Error :0.651041666664
Error :0.651041666664
Error :0.651041666664
...
If we change the data to:
X = np.array([[0,0],
              [0,1],
              [1,0],
              [1,1]])
y = np.array([[0],
              [1],
              [1],
              [0]])
wh = 2 * np.random.random((2,4)) - 1
wo = 2 * np.random.random((4,1)) - 1
then it works as it should. I don't understand why this happens; could someone please enlighten me? Thanks.