TensorFlow Neural Network not working properly

Time: 2017-10-14 03:04:41

Tags: python machine-learning tensorflow

Thank you for considering my question. I am having trouble with TensorFlow: I feed in my data and it keeps outputting:

('Epoch ', 0, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 1, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 2, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 3, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 4, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 5, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 6, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 7, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 8, ' completed out of ', 10, 'loss:', nan)
('Epoch ', 9, ' completed out of ', 10, 'loss:', nan)
('Accuracy:', 1.0)

My X_train data is a 500-by-1000 matrix in which every row contains numbers such as:

-0.38484444, 1.4542222222 ...

I hope you get the idea. My Y_train data consists of binary class labels (0, 1). len(X_train[0]) returns 1000, which is the number of values per sample (the columns).

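I'm not sure what else I need to clarify about my problem, so I am including my simple TensorFlow code. If you need more explanation of my code or my problem, please let me know.

Thank you for your time.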

import tensorflow as tf
import pandas as pd
import numpy as np

da = pd.read_csv("data.csv", header=None)
ta = pd.read_csv("BMI.csv")

X_data = da.iloc[:, :1000]
Y_data = np.expand_dims(ta.iloc[:, -1], axis = 1)

X_train = X_data.iloc[:500 :,]
X_test = X_data.iloc[500:,:]

Y_train = Y_data[:500 :,]
Y_test = Y_data[735:,:]


X_train = np.array(X_train)
X_test = np.array(X_test)

n_nodes_hl1 = 500
n_nodes_hl2 = 500
n_nodes_hl3 = 500

n_classes = 1
batch_size = 10

x = tf.placeholder('float', [None, len(X_train[0])])
y = tf.placeholder('float')

def neural_network_model(data):
    hidden_1_layer = {'weights': tf.Variable(tf.random_normal([len(X_train[0]), n_nodes_hl1])),
                      'biases': tf.Variable(tf.random_normal([n_nodes_hl1]))}
    hidden_2_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases': tf.Variable(tf.random_normal([n_nodes_hl2]))}
    hidden_3_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
                      'biases': tf.Variable(tf.random_normal([n_nodes_hl3]))}
    output_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
                      'biases': tf.Variable(tf.random_normal([n_classes]))}

    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)

    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)

    l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    l3 = tf.nn.relu(l3)

    output = tf.matmul(l3, output_layer['weights']) + output_layer['biases']

    return output


def train_nueral_network(x):
    prediction = neural_network_model(x)
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y) )
    optimizer = tf.train.AdamOptimizer().minimize(cost)

    hm_epochs = 10

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for epoch in range(hm_epochs):
            epoch_loss = 0

            i = 0
            while i < len(X_train[0]):
                start = i
                end = i + batch_size

                batch_x = np.array(X_train[start:end])
                batch_y = np.array(Y_train[start:end])

                _, c = sess.run([optimizer, cost], feed_dict= {x: batch_x, y: batch_y})
                epoch_loss += c
                i += batch_size


            print('Epoch ', epoch, ' completed out of ', hm_epochs, 'loss:', epoch_loss)

        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        print('Accuracy:', accuracy.eval({x:X_test, y:Y_test}))


train_nueral_network(x)   

2 Answers:

Answer 0 (score: 0)

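You could replace tf.nn.relu with tf.nn.tanh, train again, and check whether you get the same result. ReLU can sometimes cause gradient problems such as the vanishing gradient problem.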

https://ayearofai.com/rohan-4-the-vanishing-gradient-problem-ec68f76ffb9b
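To make the suggested swap a little more concrete, here is a minimal plain-Python sketch that compares the derivatives of the two activations at a few hypothetical input values. The helper names relu_grad and tanh_grad are made up for illustration and are not part of TensorFlow. The sketch only shows how the gradients differ; it does not establish that the activation is what produces the nan loss in the question.

```python
import math

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, small but never exactly 0 here."""
    return 1.0 - math.tanh(x) ** 2

# Hypothetical pre-activation values, chosen only for illustration.
inputs = [-4.0, -0.5, 0.5, 4.0]

relu_grads = [relu_grad(v) for v in inputs]
tanh_grads = [tanh_grad(v) for v in inputs]

for v, rg, tg in zip(inputs, relu_grads, tanh_grads):
    print("x = %5.1f   relu' = %.1f   tanh' = %.5f" % (v, rg, tg))
```

A ReLU unit passes no gradient at all for negative inputs, while tanh passes a small gradient that shrinks as the input moves away from zero, so the two can behave quite differently during training.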

Answer 1 (score: 0)

Your n_classes=1. As a result, you are applying softmax on top of a single neuron, which always evaluates to 1. You should set n_classes=2.
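The point about a single output neuron can be checked without TensorFlow. The following is a minimal plain-Python sketch; softmax and cross_entropy are hand-written helpers for illustration (not the TensorFlow ops), and the logit values are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, labels):
    """Cross-entropy between softmax(logits) and a label distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(labels, probs))

# One output neuron: the softmax is 1.0 whatever the logit is.
single = [softmax([z]) for z in (-5.0, 0.0, 3.2)]

# Two output neurons: the probabilities now depend on the logits.
pair = softmax([2.0, 0.0])

# With one neuron the loss is 0 for label 0 and for label 1 alike,
# so the labels never influence the loss.
loss_label_0 = cross_entropy([3.2], [0.0])
loss_label_1 = cross_entropy([3.2], [1.0])

print(single)
print(pair)
```

Because the single-neuron softmax is constant, the loss carries no information about the labels, which is why two output neurons (n_classes=2) are needed for softmax cross-entropy to be meaningful.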

Also, in your current setup, the accuracy evaluation you are using will always report 100% correct:

correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))

This is because prediction and y both have shape (BATCH_SIZE, 1), so argmax is always 0 for every sample.
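A small plain-Python sketch of that effect follows. The argmax_axis1 helper is a hypothetical stand-in for an argmax over axis 1, and the numbers are made up; only the shape (one column per row) matters.

```python
def argmax_axis1(rows):
    """Index of the largest value in each row, like an argmax over axis 1."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]

# Hypothetical values with shape (4, 1), like prediction and y in the question.
prediction = [[0.7], [-1.3], [2.1], [0.0]]
y = [[1.0], [0.0], [0.0], [1.0]]

pred_idx = argmax_axis1(prediction)  # each row has one column, so index 0
true_idx = argmax_axis1(y)           # same here, whatever the label value is

correct = [p == t for p, t in zip(pred_idx, true_idx)]
accuracy = sum(correct) / len(correct)

print(pred_idx, true_idx, accuracy)
```

Both index lists contain only zeros, so the comparison is true for every sample and the reported accuracy is 1.0 regardless of what the network predicts.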

I suggest representing y in a one-hot encoding. Once that is done, the rest of the code should work fine.
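As a rough sketch of what a one-hot encoding of binary labels could look like, assuming two classes and integer labels 0 and 1. The one_hot helper and the sample labels are hypothetical and not taken from the question's data.

```python
def one_hot(labels, n_classes=2):
    """Turn integer class labels into one-hot rows of length n_classes."""
    return [[1.0 if k == int(label) else 0.0 for k in range(n_classes)]
            for label in labels]

# Hypothetical binary labels standing in for the last column of BMI.csv.
labels = [0, 1, 1, 0]

y_one_hot = one_hot(labels)

# With two columns per row, the position of the 1.0 recovers the class,
# so an argmax over axis 1 becomes meaningful again.
recovered = [row.index(1.0) for row in y_one_hot]

print(y_one_hot)
print(recovered)
```

With NumPy, indexing np.eye(n_classes) with a 1-D integer array of labels should give the same rows. The resulting array has shape (N, 2), which matches a network built with n_classes=2.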