Raising to the square using TensorFlow with the Dataset class

Time: 2017-12-04 16:47:30

Tags: python tensorflow neural-network tensorflow-datasets

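I want to write a neural network that learns the x^2 distribution without a predefined model. More precisely, it is given some points in [-1,1] together with their squares for training, and it then has to reproduce and predict similar values on, e.g., [-10,10]. I more or less managed this without a Dataset. Then I tried to modify it to use a Dataset, in order to learn how to use one. The program now runs, but the output is worse than before: it is mostly a constant 0.

The previous version behaved like x^2 on [-1,1], with a linear continuation outside that interval, which was better. [Image: Previous output] The blue line is now flat. The goal is for it to coincide with the red one.

The comments in the code are in Polish, sorry.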

# square2.py - drugie podejscie do trenowania sieci za pomocą Tensorflow
# cel: nauczyć sieć rozpoznawać rozkład x**2
# analiza skryptu z:
# https://stackoverflow.com/questions/43140591/neural-network-to-predict-nth-square

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.python.framework.ops import reset_default_graph

# def. danych do trenowania sieci
# x_train = (np.random.rand(10**3)*4-2).reshape(-1,1)
# y_train = x_train**2
square2_dane = np.load("square2_dane.npz")
x_train = square2_dane['x_tren'].reshape(-1,1)
y_train = square2_dane['y_tren'].reshape(-1,1) 

# zoptymalizować dzielenie danych
# x_train = square2_dane['x_tren'].reshape(-1,1)
# ds_x = tf.data.Dataset.from_tensor_slices(x_train)
# batch_x = ds_x.batch(rozm_paczki)
# iterator = ds_x.make_one_shot_iterator()

# określenie parametrów sieci
wymiary = [50,50,50,1]
epoki = 500
rozm_paczki = 200

reset_default_graph()
X = tf.placeholder(tf.float32, shape=[None,1])
Y = tf.placeholder(tf.float32, shape=[None,1])

weights = []
biases = []
n_inputs = 1

# inicjalizacja zmiennych
for i,n_outputs in enumerate(wymiary):
    with tf.variable_scope("layer_{}".format(i)):
        w = tf.get_variable(name="W", shape=[n_inputs,n_outputs],initializer = tf.random_normal_initializer(mean=0.0,stddev=0.02,seed=42))
        b=tf.get_variable(name="b",shape=[n_outputs],initializer=tf.zeros_initializer)
        weights.append(w)
        biases.append(b)
        n_inputs=n_outputs

def forward_pass(X,weights,biases):
    h=X
    for i in range(len(weights)):
        h=tf.add(tf.matmul(h,weights[i]),biases[i])
        h=tf.nn.relu(h)
    return h    

output_layer = forward_pass(X,weights,biases)
f_strat = tf.reduce_mean(tf.squared_difference(output_layer,Y),1)
f_strat = tf.reduce_sum(f_strat)
# alternatywna funkcja straty
#f_strat2 = tf.reduce_sum(tf.abs(Y-y_train)/y_train)
optimizer = tf.train.AdamOptimizer(learning_rate=0.003).minimize(f_strat)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # trenowanie
    dataset = tf.data.Dataset.from_tensor_slices((x_train,y_train))
    dataset = dataset.batch(rozm_paczki)
    dataset = dataset.repeat(epoki)
    iterator = dataset.make_one_shot_iterator()
    ds_x, ds_y = iterator.get_next()
    sess.run(optimizer, {X: sess.run(ds_x), Y: sess.run(ds_y)})
    saver = tf.train.Saver()
    save = saver.save(sess, "./model.ckpt")
    print("Model zapisano jako: %s" % save)

    # puszczenie sieci na danych
    x_test = np.linspace(-1,1,600)
    network_outputs = sess.run(output_layer,feed_dict = {X :x_test.reshape(-1,1)})

plt.plot(x_test,x_test**2,color='r',label='y=x^2')
plt.plot(x_test,network_outputs,color='b',label='sieć NN')
plt.legend(loc='right')
plt.show()

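I think the problem is in how the training data is fed in, sess.run(optimizer, {X: sess.run(ds_x), Y: sess.run(ds_y)}), or in the definition of ds_x, ds_y. This is my first program of this kind. For comparison, here is the previous version of those lines (the contents of the 'sess' block):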

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # trenowanie
    for i in range(epoki):
        idx = np.arange(len(x_train))
        np.random.shuffle(idx)
        for j in range(len(x_train)//rozm_paczki):
            cur_idx = idx[rozm_paczki*j:(rozm_paczki+1)*j]
            sess.run(optimizer,feed_dict = {X:x_train[cur_idx],Y:y_train[cur_idx]})
    saver = tf.train.Saver()
    save = saver.save(sess, "./model.ckpt")
    print("Model zapisano jako: %s" % save)
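A note on the mini-batch slice in the inner loop above, as a plain-Python sketch. It assumes 1000 training examples (the size used in the commented-out data generation; the contents of square2_dane.npz are not shown) and assumes the intended upper bound of the slice is rozm_paczki*(j+1). The names num_examples and batch_size are stand-ins for len(x_train) and rozm_paczki. Under those assumptions, the expression as written, (rozm_paczki+1)*j, produces slices of length j rather than full batches:

```python
# Plain-Python sketch of the index slicing only; no TensorFlow involved.
# Assumption: 1000 examples, batch size 200 (rozm_paczki in the code above).
num_examples = 1000
batch_size = 200
idx = list(range(num_examples))

# Slice bounds exactly as written in the loop: [batch_size*j, (batch_size+1)*j)
as_written = [len(idx[batch_size * j:(batch_size + 1) * j])
              for j in range(num_examples // batch_size)]

# Assumed intent: consecutive, non-overlapping batches of batch_size indices.
assumed_intent = [len(idx[batch_size * j:batch_size * (j + 1)])
                  for j in range(num_examples // batch_size)]

print(as_written)      # [0, 1, 2, 3, 4]
print(assumed_intent)  # [200, 200, 200, 200, 200]
```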

Thanks!

P.S.: I was highly inspired by "Neural Network to predict nth square".

1 answer:

Answer 0 (score: 1)

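Two problems combine to give your model poor accuracy, and both involve this line:

sess.run(optimizer, {X: sess.run(ds_x), Y: sess.run(ds_y)})

  1. Only one training step is executed, because this code is not in a loop. Your original code ran len(x_train)//rozm_paczki steps in each of the epoki epochs, which should make far more progress.

  2. The two calls sess.run(ds_x) and sess.run(ds_y) run in separate steps, which means they return unrelated values from different batches. Each call to sess.run(ds_x) or sess.run(ds_y) advances the Iterator to the next batch and discards any part of the input element that you did not explicitly request in that sess.run() call. In effect, you get X from batch i and Y from batch i + 1 (or vice versa), and the model trains on invalid data. If you want values from the same batch, you need to fetch them in a single sess.run([ds_x, ds_y]) call.

There are two further problems that might affect efficiency:

  3. The Dataset is not shuffled. Your original code calls np.random.shuffle() at the start of each epoch. You should add dataset = dataset.shuffle(len(x_train)) before dataset = dataset.repeat().

  4. Pulling values out of the Iterator back into Python (e.g. when you do sess.run(ds_x)) and feeding them back into the training step is inefficient. It is more efficient to pass the output of the Iterator.get_next() operation directly as the input of the feed-forward step.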
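The mismatch described in point 2 can be illustrated with a plain-Python analogy. This is not TensorFlow code: the generator make_batches is a hypothetical stand-in for an iterator over (x, y) batches, and each next() call plays the role of one sess.run() that advances the Iterator.

```python
# Plain-Python analogy (not TensorFlow): one shared iterator over (x, y) batches.
def make_batches(xs, batch_size):
    """Yield (x_batch, y_batch) pairs where y = x**2."""
    for start in range(0, len(xs), batch_size):
        x_batch = xs[start:start + batch_size]
        yield x_batch, [x * x for x in x_batch]

xs = [1, 2, 3, 4, 5, 6]

# Two separate fetches: each one advances the iterator by a whole batch,
# so x comes from batch 0 while y comes from batch 1.
it = make_batches(xs, 2)
x_wrong = next(it)[0]
y_wrong = next(it)[1]
mismatched = list(zip(x_wrong, y_wrong))

# One fetch: x and y are taken from the same batch.
it = make_batches(xs, 2)
x_ok, y_ok = next(it)
matched = list(zip(x_ok, y_ok))

print(mismatched)  # [(1, 9), (2, 16)]
print(matched)     # [(1, 1), (2, 4)]
```

In the mismatched case every training pair is wrong, which is consistent with a model that learns nothing useful from the data.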

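Putting this all together, here is a rewritten version of your program that addresses these four issues and achieves correct results. (Unfortunately my Polish is not good enough to preserve the comments, so I have translated them to English.)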

      import tensorflow as tf
      import matplotlib.pyplot as plt
      import numpy as np
      
      # Generate training data.
      x_train = np.random.rand(10**3, 1).astype(np.float32) * 4 - 2
      y_train = x_train ** 2
      
      # Define hyperparameters.
      DIMENSIONS = [50,50,50,1]
      NUM_EPOCHS = 500
      BATCH_SIZE = 200
      
      dataset = tf.data.Dataset.from_tensor_slices((x_train,y_train))
      dataset = dataset.shuffle(len(x_train))  # (Point 3.) Shuffle each epoch.
      dataset = dataset.repeat(NUM_EPOCHS)
      dataset = dataset.batch(BATCH_SIZE)
      iterator = dataset.make_one_shot_iterator()
      
      # (Point 2.) Ensure that `X` and `Y` correspond to the same batch of data.
      # (Point 4.) Pass the tensors returned from `iterator.get_next()`
      # directly as the input of the network.
      X, Y = iterator.get_next()
      
      # Initialize variables.
      weights = []
      biases = []
      n_inputs = 1
      for i, n_outputs in enumerate(DIMENSIONS):
        with tf.variable_scope("layer_{}".format(i)):
          w = tf.get_variable(name="W", shape=[n_inputs, n_outputs],
                              initializer=tf.random_normal_initializer(
                                  mean=0.0, stddev=0.02, seed=42))
          b = tf.get_variable(name="b", shape=[n_outputs],
                              initializer=tf.zeros_initializer)
          weights.append(w)
          biases.append(b)
          n_inputs = n_outputs
      
      def forward_pass(X,weights,biases):
        h = X
        for i in range(len(weights)):
          h=tf.add(tf.matmul(h, weights[i]), biases[i])
          h=tf.nn.relu(h)
        return h
      
      output_layer = forward_pass(X, weights, biases)
      loss = tf.reduce_sum(tf.reduce_mean(
          tf.squared_difference(output_layer, Y), 1))
      optimizer = tf.train.AdamOptimizer(learning_rate=0.003).minimize(loss)
      saver = tf.train.Saver()
      
      with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
      
        # (Point 1.) Run the `optimizer` in a loop. Use try-while-except to iterate
        # until all elements in `dataset` have been consumed.
        try:
          while True:
            sess.run(optimizer)
        except tf.errors.OutOfRangeError:
          pass
      
        save = saver.save(sess, "./model.ckpt")
        print("Model saved to path: %s" % save)
      
        # Evaluate network.
        x_test = np.linspace(-1, 1, 600)
        network_outputs = sess.run(output_layer, feed_dict={X: x_test.reshape(-1, 1)})
      
      plt.plot(x_test,x_test**2,color='r',label='y=x^2')
      plt.plot(x_test,network_outputs,color='b',label='NN prediction')
      plt.legend(loc='right')
      plt.show()
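As a rough sanity check on how long the try/while loop above runs before OutOfRangeError is raised, the number of training steps can be estimated from the hyperparameters. This is a back-of-the-envelope sketch in plain Python, assuming repeat() is applied before batch() as in the program above; the helper name num_steps is hypothetical.

```python
import math

def num_steps(num_examples, num_epochs, batch_size):
    """Batches produced when the dataset is repeated first and batched second.

    Repeating first yields num_examples * num_epochs elements in total;
    batching then groups them, with a possibly smaller final batch.
    """
    return math.ceil(num_examples * num_epochs / batch_size)

# Values taken from the program above: 10**3 examples, NUM_EPOCHS, BATCH_SIZE.
steps = num_steps(10**3, 500, 200)
print(steps)  # 2500
```

That is 2500 optimizer steps, compared with the single step executed by the line discussed in point 1.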