Two-layer neural network accuracy not improving

Asked: 2018-04-08 15:38:55

Tags: python tensorflow machine-learning neural-network deep-learning

I am very new to tensorflow and am building my first two-layer neural network. I am using the heart disease dataset from UCI.

import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

RANDOM_SEED = 41
tf.set_random_seed(RANDOM_SEED)

def init_weights(shape):
    """ Weight initialization """
    weights = tf.random_normal(shape, stddev=0.1)
    return tf.Variable(weights)

def forwardprop(X, w_1, w_2, w_3):
    h_1   = tf.nn.sigmoid(tf.matmul(X, w_1))
    h_2   = tf.nn.sigmoid(tf.matmul(h_1, w_2))
    yhat = tf.nn.sigmoid(tf.matmul(h_2, w_3))
    return yhat

def get_heart_data():   
    disease = pd.read_csv('../data/disease.csv')

    disease.replace(to_replace="?", value = "u", inplace = True)
    disease = pd.get_dummies(disease, columns=['ca', 'thal', 'fbs', 'exang', 'slop', 'sex', 'cp'], drop_first=True)

    all_X = disease.drop(['pred_attribute'],1)
    all_y = disease['pred_attribute']
    all_y = pd.get_dummies(all_y, columns=['pred_attribute'], drop_first=False)
    return train_test_split(all_X, all_y, test_size=0.3, random_state=RANDOM_SEED)

def main():
    train_X, test_X, train_y, test_y = get_heart_data()

    # Layer sizes
    x_size = 21
    h_1_size = 154 
    h_2_size = 79  
    y_size = 5

    # Symbols
    X = tf.placeholder("float", shape=[None, x_size])
    y = tf.placeholder("float", shape=[None, y_size])

    # Weight initializations
    w_1 = init_weights((x_size, h_1_size))
    w_2 = init_weights((h_1_size, h_2_size))
    w_3 = init_weights((h_2_size, y_size))

    # Forward propagation
    logits   = forwardprop(X, w_1, w_2, w_3)

    # Backward propagation
    cost    = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
    updates = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

    # Run SGD
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)


    for epoch in range(100):
        # Train with each example
        for i in range(len(train_X)):

            sess.run(updates, feed_dict={X: train_X, y: train_y })

        pred = tf.nn.softmax(logits)  # Apply softmax to logits
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        training_accuracy = sess.run(accuracy, feed_dict={X: train_X, y: train_y})
        testing_accuracy = sess.run(accuracy, feed_dict={X: test_X, y: test_y})
        print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
          % (epoch + 1, 100 * training_accuracy, 100. * testing_accuracy))

    sess.close()

main()

I think I have everything set up correctly, but when I run the program it gives me the same accuracy over and over again.

Epoch = 1, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 2, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 3, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 4, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 5, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 6, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 7, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 8, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 9, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 10, train accuracy = 55.19%, test accuracy = 51.65%

This continues all the way to epoch 100. I have even tried multiplying the accuracy by 100000 to see whether it was changing in some tiny decimal place, but it stays exactly the same every time. I don't know whether the problem is my network, my accuracy function, or something else. Any help is greatly appreciated,
- Matt

2 Answers:

Answer 0 (Score: 0)

Your test and training sets stay the same at every epoch, so why would you expect different results?

I think what you want is to start with just a few samples and add more of them epoch by epoch:

n_epoch = 100
# Assuming of course that you have more than n_epoch samples in each of your sets
trainSamplesByEpoch = int(len(train_X) / n_epoch)      
for epoch in range(1, n_epoch + 1):
    train_X_current = train_X[0:epoch*trainSamplesByEpoch] 
    train_y_current = train_y[0:epoch*trainSamplesByEpoch] 
    # Train your network with train_X_current, train_y_current 
    # Compute the train accuracy with train_X_current, train_y_current  
    # Compute the test accuracy with test_X, test_y

Use epochs this way to find out how many samples are enough to reach the performance you need. Training on all of the samples at once may give you an over-fitted model.
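
A rough sketch of how this could be dropped into the session loop from the question (assuming the X and y placeholders, the updates op, and an accuracy tensor like the one in the question's main() already exist):

n_epoch = 100
samples_per_epoch = max(1, len(train_X) // n_epoch)

for epoch in range(1, n_epoch + 1):
    # Grow the training subset by samples_per_epoch rows each epoch
    train_X_current = train_X[:epoch * samples_per_epoch]
    train_y_current = train_y[:epoch * samples_per_epoch]

    sess.run(updates, feed_dict={X: train_X_current, y: train_y_current})

    training_accuracy = sess.run(accuracy, feed_dict={X: train_X_current, y: train_y_current})
    testing_accuracy = sess.run(accuracy, feed_dict={X: test_X, y: test_y})
    print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
          % (epoch, 100. * training_accuracy, 100. * testing_accuracy))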

Answer 1 (Score: 0)

One possible problem is the way you built the network. You are using non-linearities everywhere, even on your output layer.

The loss you are using, tf.nn.softmax_cross_entropy_with_logits(), expects the logits to be a linear function: it applies the softmax internally, so wrapping the output layer in a sigmoid first squashes the logits into (0, 1) and leaves the network with almost no gradient to learn from. So this is how the network should be built:

def forwardprop(X, w_1, w_2, w_3):
    h_1  = tf.nn.sigmoid(tf.matmul(X, w_1))
    h_2  = tf.nn.sigmoid(tf.matmul(h_1, w_2))
    yhat = tf.matmul(h_2, w_3)
    return yhat

The way you have structured the code will also cause problems as you move to more complicated networks. You should not be thinking in terms of forwardprop() and backprop(). When writing tensorflow code, think of it as specifying a graph and the computations you need. Have a look at the tensorflow tutorials for the gold standard on how to structure your code.
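
For example, a minimal restructuring of the question's main() along those lines might look like this (a sketch reusing the question's placeholders and weights; note that the prediction and accuracy ops are built once, before the session starts, instead of being re-created inside the training loop):

# --- Graph construction: every op is defined exactly once ---
logits = forwardprop(X, w_1, w_2, w_3)   # linear output layer
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
updates = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

pred = tf.nn.softmax(logits)             # softmax applied only for prediction
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

# --- Execution: the session only runs ops that already exist ---
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(100):
        sess.run(updates, feed_dict={X: train_X, y: train_y})
        train_acc = sess.run(accuracy, feed_dict={X: train_X, y: train_y})
        test_acc = sess.run(accuracy, feed_dict={X: test_X, y: test_y})
        print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
              % (epoch + 1, 100. * train_acc, 100. * test_acc))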