Neural network does not converge; weights are not being updated

Time: 2016-06-14 15:55:48

Tags: python machine-learning neural-network tensorflow

I am trying to train this NN on a regression problem. However, its weights are not being adjusted during the iterations; they stay constant. I have tried different optimizers, such as Adadelta and plain gradient descent, and I also use batch normalization to preprocess the data. Below is the code I am using, together with the logs, which show the error on the validation set and the weights and biases remaining constant:
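As a sanity check, the variables can also be fetched as NumPy arrays around a single training step and compared directly, rather than relying on the tf.Print output alone. A minimal sketch, assuming the session, opt_op, weights, and feed_dict defined in the code below:

# Sketch: snapshot a variable before and after one optimizer step.
w_before = session.run(weights)                  # current value as a NumPy array
session.run(opt_op, feed_dict=feed_dict)         # run one training step
w_after = session.run(weights)                   # value after the update
print("weights updated:", not np.allclose(w_before, w_after))

If this prints False for every variable, the optimizer is genuinely not touching them.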

from __future__ import print_function
import numpy as np
import tensorflow as tf
from six.moves import range

from load_trm_data import readDatasetFromHdf

datasetSize = 5000

train_dataset, train_labels, valid_dataset, valid_labels, test_dataset, test_labels  = readDatasetFromHdf(datasetSize)


print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)

def accuracy(predictions, labels):
    return (100.0 * np.sum(np.argmax(predictions) == np.argmax(labels))/predictions.shape[0])


def error_mean(predictions, labels):
    return np.mean(np.power(predictions - labels,2))


batch_size = 128

n_input_param = 14
n_layer_one = 30
n_output_layer = 1

graph = tf.Graph()
with graph.as_default():

   # Input data.
   tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, n_input_param))
   tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size))
   tf_valid_dataset = tf.constant(valid_dataset)
   tf_test_dataset = tf.constant(test_dataset)

   # Variables.
   weights = tf.Variable(tf.truncated_normal([n_input_param, n_layer_one]))
   biases = tf.Variable(tf.zeros([n_layer_one]))

   weights2 = tf.Variable(tf.truncated_normal([n_layer_one, n_output_layer]))

   # Training computation.    
   mu = tf.Variable(0.0, name='mu')
   mu = tf.reduce_mean(tf_train_dataset, 0)
   variance = tf.reduce_sum(tf.pow(tf.sub(tf_train_dataset, mu), 2), 0) / batch_size
   epsilon = 0.001   

   batch_norm = tf.nn.batch_normalization(tf_train_dataset, mu, variance, None, None, epsilon)  


   logits = tf.nn.relu(tf.matmul(batch_norm, weights) + biases  )
   hidden = tf.matmul(logits, weights2) #+ biases2
   hidden = tf.Print(hidden, [hidden], "hidden: ")
   loss = tf.reduce_mean(tf.pow(tf.sub(hidden,tf_train_labels), 2)) 

   # Optimizer.
   #optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
   optimizer = tf.train.AdadeltaOptimizer(0.01, 0.95, 1e-08, False) 
   opt_op = optimizer.minimize(loss, var_list=[weights, weights2, biases])

   weights = tf.Print(weights, [weights], "Weights: ")
   weights2 = tf.Print(weights2, [weights2], "Weights2: ")
   biases = tf.Print(biases, [biases], "Biases: ")

   # Predictions for the training, validation, and test data.
   train_prediction = tf.nn.softmax(hidden)
   valid_prediction = tf.matmul(tf.nn.relu(tf.matmul(tf_valid_dataset, weights)+biases), weights2)
   test_prediction = tf.matmul(tf.nn.relu(tf.matmul(tf_test_dataset, weights)+biases), weights2)


num_steps = 3001

with tf.Session(graph=graph) as session:
  tf.initialize_all_variables().run()
  print("Initialized")

  for step in range(num_steps):

    offset = (step * batch_size) % (train_labels.shape[0] - batch_size)

    batch_data = train_dataset[offset:(offset + batch_size), :]
    batch_labels = train_labels[offset:(offset + batch_size)].reshape(batch_size,)

    feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
    _, l, predictions = session.run([opt_op, loss, train_prediction], feed_dict=feed_dict)

    if (step % 500 == 0):
      print("Minibatch loss at step %d: %f" % (step, l))
      print("Minibatch error: %f" % error_mean(predictions, batch_labels))
      print("Validation error: %f" % error_mean(valid_prediction.eval(), valid_labels))

  print("Test mean error: %f" % error_mean(test_prediction.eval(), test_labels))

Training set (5000, 14) (5000, 1)
Validation set (2500, 14) (2500, 1)
Test set (2500, 14) (2500, 1)
Initialized

Minibatch loss at step 0: 57.574135
Minibatch error: 20.750767
Validation error: 63221.228800

Minibatch loss at step 500: 67.077072
Minibatch error: 13.863234
Validation error: 63221.228800

Minibatch loss at step 1000: 41.973755
Minibatch error: 12.603945
Validation error: 63221.228800

Minibatch loss at step 1500: 57.691139
Minibatch error: 14.036787
Validation error: 63221.228800

Minibatch loss at step 2000: 55.477966
Minibatch error: 14.557659
Validation error: 63221.228800

Minibatch loss at step 2500: 86.001404
Minibatch error: 19.441383
Validation error: 63221.228800

Minibatch loss at step 3000: 92.167984
Minibatch error: 11.834186
Validation error: 63221.228800
Test mean error: 54048.582031

I tensorflow/core/kernels/logging_ops.cc:79] Biases: [0 0 0...]
I tensorflow/core/kernels/logging_ops.cc:79] hidden: [3.6555634 5.9512711 3.6907592...]
I tensorflow/core/kernels/logging_ops.cc:79] Biases: [0 0 0...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights2: [-0.12304196 1.0204707 0.86253721...]
I tensorflow/core/kernels/logging_ops.cc:79] hidden: [-2.7658644 16.363554 -4.9159303...]
I tensorflow/core/kernels/logging_ops.cc:79] Biases: [0 0 0...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights2: [-0.12304196 1.0204707 0.86253721...]
I tensorflow/core/kernels/logging_ops.cc:79] hidden: [-4.6260657 14.945541 -2.8992696...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights2: [-0.12304196 1.0204707 0.86253721...]
I tensorflow/core/kernels/logging_ops.cc:79] Biases: [0 0 0...]
I tensorflow/core/kernels/logging_ops.cc:79] hidden: [4.5799894 15.350223 3.151691...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights2: [-0.12304196 1.0204707 0.86253721...]
I tensorflow/core/kernels/logging_ops.cc:79] Biases: [0 0 0...]
I tensorflow/core/kernels/logging_ops.cc:79] hidden: [10.983672 13.306148 -3.630734...]
I tensorflow/core/kernels/logging_ops.cc:79] Biases: [0 0 0...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights: [1.0648602 -0.6320048 1.2979335...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights2: [-0.12304196 1.0204707 0.86253721...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights: [1.0648602 -0.6320048 1.2979335...]
I tensorflow/core/kernels/logging_ops.cc:79] Biases: [0 0 0...]
I tensorflow/core/kernels/logging_ops.cc:79] Weights2: [-0.12304196 1.0204707 0.86253721...]
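For context on the log lines above: tf.Print is an identity op that writes its message only when the tensor it returns is actually evaluated in a session.run or eval call. A standalone sketch (hypothetical values, not taken from the code above):

# Sketch: tf.Print passes its input through and logs only on evaluation.
x = tf.constant([1.0, 2.0, 3.0])
y = tf.Print(x, [x], "x: ")
with tf.Session() as s:
    s.run(y)  # emits an "x: [1 2 3]" line via logging_ops.cc

This is why the Weights/Biases lines appear only when a tensor built downstream of the tf.Print wrappers is evaluated.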

Edit: I tried to narrow the example code down:

import numpy as np
import tensorflow as tf
from six.moves import range
from load_trm_data import readDatasetFromHdf

datasetSize = 5000
batch_size = 128

train_dataset, train_labels, valid_dataset, valid_labels, test_dataset, test_labels  = readDatasetFromHdf(datasetSize)

def error_mean(predictions, labels):
    return np.mean(np.power(predictions - labels,2))

n_input_param = 14
n_layer_one = 30
n_output_layer = 1

graph = tf.Graph()
with graph.as_default():

  # Input data.
  tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, n_input_param))
  tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size))
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)

  # Variables.
  weights = tf.Variable(tf.truncated_normal([n_input_param, n_layer_one]))
  biases = tf.Variable(tf.zeros([n_layer_one]))  
  weights2 = tf.Variable(tf.truncated_normal([n_layer_one, n_output_layer]))

  #Batch normalization
  mu = tf.reduce_mean(tf_train_dataset, 0)
  variance = tf.reduce_sum(tf.pow(tf.sub(tf_train_dataset, mu), 2), 0) / batch_size    
  batch_norm = tf.nn.batch_normalization(tf_train_dataset, mu, variance, None, None, 0.001)  

  # Training computation
  logits = tf.nn.relu(tf.matmul(batch_norm, weights) + biases  )
  hidden = tf.nn.relu(tf.matmul(logits, weights2) )
  loss = tf.reduce_mean(tf.pow(tf.sub(hidden,tf_train_labels), 2)) 

  # Optimizer
  optimizer = tf.train.AdadeltaOptimizer(0.01, 0.95, 1e-08, False) 
  opt_op = optimizer.minimize(loss, var_list=[weights, weights2, biases])

  # Predictions for the training, validation, and test data.
  train_prediction = tf.nn.softmax(hidden)
  valid_prediction = tf.matmul(tf.nn.relu(tf.matmul(tf_valid_dataset, weights)+biases), weights2)          
  test_prediction = tf.matmul(tf.nn.relu(tf.matmul(tf_test_dataset, weights)+biases), weights2)  

num_steps = 3001

with tf.Session(graph=graph) as session:
  tf.initialize_all_variables().run()

  for step in range(num_steps):

    offset = (step * batch_size) % (train_labels.shape[0] - batch_size)

    batch_data = train_dataset[offset:(offset + batch_size), :]
    batch_labels = train_labels[offset:(offset + batch_size)].reshape(batch_size,)    

    feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
    _, l, predictions = session.run([opt_op, loss, train_prediction], feed_dict=feed_dict)

Edit: Is my conceptual setup for this regression problem correct (e.g., is ReLU the right choice)? And is it correct to use softmax on the output node?
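For reference, a minimal sketch of a plain linear head for a single-output regression, reusing the names from the reduced example above (an illustrative assumption about the intended setup, not a verified fix):

# Sketch: linear output (no activation) with a mean-squared-error loss.
hidden_layer = tf.nn.relu(tf.matmul(batch_norm, weights) + biases)
output = tf.matmul(hidden_layer, weights2)   # shape (batch_size, 1)
output = tf.reshape(output, [batch_size])    # match the (batch_size,) labels
loss = tf.reduce_mean(tf.pow(tf.sub(output, tf_train_labels), 2))

Note that tf.nn.softmax applied to a single output node always returns 1.0, since the softmax normalizes over that one value.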

0 Answers:

No answers yet