Small neural network won't train: wrong dataset or coding error?

Date: 2018-01-09 04:42:54

Tags: python tensorflow deep-learning

NN masters! I'm new to TensorFlow and deep learning, and as a small tutorial exercise I built a fully connected NN.

The input is 2 values representing the x and y coordinates of a point in the range 0 to 1, and I want the network to tell whether the point lies in the middle. Simple, right? Literally, my input and expected data are:

inputs = [[0,0],
      [0,0.3],
      [0.3,0],
      [0.3,0.5],
      [0.5,0.3],
      [0.7,0.5],
      [0.5,0.6],
      [0.7,0],
      [0,0.6],
      [0.5,1],
      [0,1],
      [1,1],
      [1, 0.5]]

expected = [[0,1],
            [0,1],
            [0,1],
            [1,0],
            [1,0],
            [1,0],
            [1,0],
            [0,1],
            [0,1],
            [0,1],
            [0,1],
            [0,1],
            [0,1]]
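For reference, one labeling rule that is consistent with all the points listed above (and with the test points further down) can be sketched in plain Python. The central-band limits of 0.25 and 0.75 are an assumption inferred from the data, not something stated in the post:

```python
# Hypothetical labeling rule inferred from the listed points: a point is
# "in the middle" when both coordinates lie strictly inside a central band.
# The band limits (0.25, 0.75) are an assumption, not given in the question.
def label(x, y):
    middle = 0.25 < x < 0.75 and 0.25 < y < 0.75
    return [1, 0] if middle else [0, 1]
```

Any band that separates the listed middle points from the border points would do equally well; this is just one consistent choice.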

So I expect the output [0,1] for the points "on the border" and [1,0] for those in the middle. However, even after 100000 iterations (!) on this dataset it works badly: the points at the edges are recognized fine, but those from the middle produce outputs close to [0.5,0.5], and points that are not in the training set give even worse results. I'd like to know what causes this. Is it a bad task setup, network construction, parameters, etc.? Or a wrong usage of TensorFlow itself? Here is the complete .py file:

'''
Created on Jan 9, 2018

@author: noob
'''

import tensorflow as tf
import matplotlib.pyplot as plt

# trying to recognize points in the middle as 1,0
print("hello from python")
inputs = [[0,0],
          [0,0.3],
          [0.3,0],
          [0.3,0.5],
          [0.5,0.3],
          [0.7,0.5],
          [0.5,0.6],
          [0.7,0],
          [0,0.6],
          [0.5,1],
          [0,1],
          [1,1],
          [1, 0.5]]

expected = [[0,1],
            [0,1],
            [0,1],
            [1,0],
            [1,0],
            [1,0],
            [1,0],
            [0,1],
            [0,1],
            [0,1],
            [0,1],
            [0,1],
            [0,1]]


test_inputs = [[0.4,0.1],
          [0.5,0.5],
          [0.9,0.2],
          [0.4,0.6]]

test_expected = [[0,1],
                 [1,0],
                 [0,1],
                 [1,0]]



# ---------- Define Layers ---------------------
in_layer = tf.placeholder(tf.float32,[1,2])

in_to_middle = tf.Variable(tf.zeros([2,10]))

middle_layer = tf.Variable(tf.zeros([10]))

middle_to_out = tf.Variable(tf.zeros([10,2]))

out_layer = tf.Variable(tf.zeros([2]))

out_expected = tf.placeholder(tf.float32,[1,2])


middle_layer = tf.nn.sigmoid(tf.matmul(in_layer,in_to_middle))

out_layer = tf.nn.sigmoid(tf.matmul(middle_layer,middle_to_out))

out_error = (out_expected - out_layer)

loss = tf.reduce_sum(tf.square(out_error))




train_step = tf.train.GradientDescentOptimizer(0.8).minimize(loss)

sess = tf.InteractiveSession()

tf.global_variables_initializer().run()

print("starting to train")
for i in range(100000):
    sess.run(train_step, feed_dict={
        in_layer: [inputs[i % len(inputs)]],
        out_expected: [expected[i % len(expected)]]
    })
    if i % 10000 == 0:
        print("step %d" % i)
print("done training")


#print("in: %s out: %s loss: %s"%(curr_in, curr_out, curr_loss))                 

def toColor(out):    
    return 'red' if (out[1] > out[0]) else 'black' #(diff,diff,diff)

def toColor2(out):    
    return 'orange' if (out[1] > out[0]) else 'blue' #(diff,diff,diff)

#for i in range(len(inputs)):
#    plt.plot(inputs[i][0], inputs[i][1], color=toColor(expected[i]), marker='o')
print("----- DATA FROM TRAINING SET ---------") 
for i in range(len(inputs)):
    curr_in, curr_out, curr_expected, curr_loss = sess.run(
        [in_layer, out_layer, out_expected, loss],
        feed_dict={
            in_layer: [inputs[i]],
            out_expected: [expected[i]]
        })
    print("----\nin: %s \nout: %s \nexpected %s \nloss: %s" % (curr_in, curr_out, curr_expected, curr_loss))
    plt.plot(curr_in[0][0], curr_in[0][1], color=toColor(curr_out[0]), marker='o')

print("----- DATA NOT IN  TRAINING SET ---------") 
for i in range(len(test_inputs)):
    curr_in, curr_out, curr_expected, curr_loss = sess.run(
        [in_layer, out_layer, out_expected, loss],
        feed_dict={
            in_layer: [test_inputs[i]],
            out_expected: [test_expected[i]]
        })
    print("----\nin: %s \nout: %s \nexpected %s \nloss: %s" % (curr_in, curr_out, curr_expected, curr_loss))
    plt.plot(curr_in[0][0], curr_in[0][1], color=toColor2(curr_out[0]), marker='o')
plt.axis((0,1,0,1))
plt.show()

1 Answer:

Answer 0 (score: 0)

First, your outputs are binary, so you should round the output of the output layer:

out_layer = tf.round(tf.nn.sigmoid(...))
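The effect of that rounding can be illustrated with a small NumPy sketch (NumPy stands in for the TF ops here; the input logits are made up for illustration). Note that rounding has a zero gradient almost everywhere, so in practice it is usually applied only when evaluating predictions, not inside the training graph:

```python
import numpy as np

def sigmoid(z):
    # Same squashing function as tf.nn.sigmoid.
    return 1.0 / (1.0 + np.exp(-z))

# Probabilities from a sigmoid output, then rounded to hard 0/1 labels.
probs = sigmoid(np.array([[2.0, -1.0]]))  # roughly [[0.88, 0.27]]
hard = np.round(probs)                    # [[1.0, 0.0]]
```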

Second, does your loss converge? If it does and you still get bad results, try lowering the learning rate (e.g., to 0.1) and see what happens.
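Why a smaller learning rate can help is easy to see on a toy problem. The sketch below (plain Python, a made-up quadratic objective unrelated to the asker's network) shows plain gradient descent diverging at a large step size and converging at a small one:

```python
def descend(lr, steps=20, w0=1.0):
    # Minimize f(w) = 2 * w**2 with plain gradient descent; grad f = 4 * w.
    # Each step multiplies w by (1 - 4 * lr), so too large an lr diverges.
    w = w0
    for _ in range(steps):
        w -= lr * 4 * w
    return w

big = descend(0.8)    # |1 - 3.2| > 1: w blows up
small = descend(0.1)  # |1 - 0.4| < 1: w shrinks toward the minimum at 0
```

The same qualitative behavior (oscillating or exploding loss at a too-large step size) is what to look for when watching the loss of the asker's network.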