Getting a simple MLP in TensorFlow to model XOR

Asked: 2015-11-29 01:27:33

Tags: python machine-learning neural-network xor tensorflow

I am trying to build a simple MLP with an input layer (2 neurons), a hidden layer (5 neurons) and an output layer (1 neuron). My plan is to train it by feeding it [[0., 0.], [0., 1.], [1., 0.], [1., 1.]] as input and [0., 1., 1., 0.] as the desired (element-wise) output:

    import tensorflow as tf

    #####################
    # preparation stuff #
    #####################

    # define input and output data
    input_data = [[0., 0.], [0., 1.], [1., 0.], [1., 1.]]  # XOR input
    output_data = [0., 1., 1., 0.]  # XOR output

    # create a placeholder for the input
    # None indicates a variable batch size for the input
    # one input's dimension is [1, 2]
    n_input = tf.placeholder(tf.float32, shape=[None, 2])

    # number of neurons in the hidden layer
    hidden_nodes = 5

    ################
    # hidden layer #
    ################
    b_hidden = tf.Variable(0.1)  # hidden layer's bias neuron
    W_hidden = tf.Variable(tf.random_uniform([hidden_nodes, 2], -1.0, 1.0))  # hidden layer's weight matrix,
                                                                             # initialized with a uniform distribution
    hidden = tf.sigmoid(tf.matmul(W_hidden, n_input) + b_hidden)  # calc hidden layer's activation

    ################
    # output layer #
    ################
    W_output = tf.Variable(tf.random_uniform([hidden_nodes, 1], -1.0, 1.0))  # output layer's weight matrix
    output = tf.sigmoid(tf.matmul(W_output, hidden))  # calc output layer's activation

    ############
    # learning #
    ############
    cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(output, n_input)  # calc cross entropy between current
                                                                              # output and desired output
    loss = tf.reduce_mean(cross_entropy)  # mean the cross_entropy
    optimizer = tf.train.GradientDescentOptimizer(0.1)  # take a gradient descent for optimizing with a "stepsize" of 0.1
    train = optimizer.minimize(loss)  # let the optimizer train

    ####################
    # initialize graph #
    ####################
    init = tf.initialize_all_variables()

    sess = tf.Session()  # create the session and therefore the graph
    sess.run(init)  # initialize all variables

    # train the network
    for epoch in xrange(0, 201):
        sess.run(train)  # run the training operation
        if epoch % 20 == 0:
            print("step: {:>3} | W: {} | b: {}".format(epoch, sess.run(W_hidden), sess.run(b_hidden)))

Unfortunately my code refuses to run. No matter what I try, I keep getting dimension errors. Quite frustrating :/ I think I am missing something, but I cannot figure out what is wrong.

For better readability I also uploaded the code to a pastebin: code

Any ideas?

EDIT: I am still getting errors :/

    hidden = tf.sigmoid(tf.matmul(n_input, W_hidden) + b_hidden)

outputs: line 27 (...) ValueError: Dimensions Dimension(2) and Dimension(5) are not compatible. Changing the line to:

    hidden = tf.sigmoid(tf.matmul(W_hidden, n_input) + b_hidden)

seems to work, but then the error shows up in:

    output = tf.sigmoid(tf.matmul(hidden, W_output))

telling me: line 34 (...) ValueError: Dimensions Dimension(2) and Dimension(5) are not compatible

Turning that statement into:

    output = tf.sigmoid(tf.matmul(W_output, hidden))

also throws an exception: line 34 (...) ValueError: Dimensions Dimension(1) and Dimension(5) are not compatible

EDIT2: I really do not understand this. Shouldn't hidden be W_hidden x n_input.T, since in dimensions that would be (5, 2) x (2, 1)? If I transpose n_input, hidden still works (I don't even get why it works without the transpose at all). However, output keeps throwing errors, even though that operation in dimensions should be (1, 5) x (5, 1)?!
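
A quick way to chase such shape errors down is to look at the static shapes TensorFlow infers while the graph is being built; the following is a minimal sketch using the same placeholder and weight shapes as the code above (variable names reused only for illustration):

    import tensorflow as tf

    n_input = tf.placeholder(tf.float32, shape=[None, 2])         # [batch, 2]
    W_hidden = tf.Variable(tf.random_uniform([5, 2], -1.0, 1.0))  # [hidden_nodes, 2], as in the question

    print(n_input.get_shape())   # prints (?, 2): unknown batch size, 2 features
    print(W_hidden.get_shape())  # prints (5, 2)

    # tf.matmul(n_input, W_hidden) needs the inner dimensions to agree, i.e. the 2
    # from (?, 2) and the 5 from (5, 2), which is exactly the reported
    # "Dimensions Dimension(2) and Dimension(5) are not compatible" error.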

1 Answer:

Answer 0 (score: 2)

(0) Including the error output is helpful; it is a useful thing to read, because it pinpoints exactly where the shape problems occur.

(1) The shape errors arise because the arguments to both of your matmuls are backwards, and the tf.Variable shape is backwards as well. The general rule is that the weights for a layer with sizes input_size and output_size should have shape [input_size, output_size], the matmul should be tf.matmul(input_to_layer, weights_for_layer), and the biases added afterwards should have shape [output_size].
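
As a minimal illustration of that rule (the sizes input_size = 2, output_size = 5 and the names x, W, b, y are chosen here purely for the example):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None, 2])         # [batch, input_size]
    W = tf.Variable(tf.random_uniform([2, 5], -1.0, 1.0))   # [input_size, output_size]
    b = tf.Variable(tf.zeros([5]))                          # [output_size]
    y = tf.sigmoid(tf.matmul(x, W) + b)                     # result is [batch, output_size]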

So with your code,

    W_hidden = tf.Variable(tf.random_uniform([hidden_nodes, 2], -1.0, 1.0))

should be:

    W_hidden = tf.Variable(tf.random_uniform([2, hidden_nodes], -1.0, 1.0))

and

    hidden = tf.sigmoid(tf.matmul(W_hidden, n_input) + b_hidden)

should use tf.matmul(n_input, W_hidden); and

    output = tf.sigmoid(tf.matmul(W_output, hidden))

should use tf.matmul(hidden, W_output).

(2) Once you have fixed those bugs, your run call needs to be given a feed_dict:

    sess.run(train)

should be:

    sess.run(train, feed_dict={n_input: input_data})

At least, I presume that is what you are trying to achieve.
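
Putting those fixes together, a minimal corrected sketch could look like the following. It keeps the layer sizes from the question but, as additional assumptions, adds a bias on the output layer, introduces a second placeholder for the targets, and swaps the original cross-entropy call for a plain squared-error loss so the example stays self-contained; treat it as one possible variant rather than the definitive version.

    import tensorflow as tf

    # XOR data; the targets are given one row per sample so they match a [None, 1] output
    input_data = [[0., 0.], [0., 1.], [1., 0.], [1., 1.]]
    output_data = [[0.], [1.], [1.], [0.]]

    n_input = tf.placeholder(tf.float32, shape=[None, 2])    # [batch, input_size]
    n_output = tf.placeholder(tf.float32, shape=[None, 1])   # [batch, output_size]

    hidden_nodes = 5

    # weights are [input_size, output_size], biases are [output_size],
    # and the layer input goes on the left of each matmul
    W_hidden = tf.Variable(tf.random_uniform([2, hidden_nodes], -1.0, 1.0))
    b_hidden = tf.Variable(tf.zeros([hidden_nodes]))
    hidden = tf.sigmoid(tf.matmul(n_input, W_hidden) + b_hidden)

    W_output = tf.Variable(tf.random_uniform([hidden_nodes, 1], -1.0, 1.0))
    b_output = tf.Variable(tf.zeros([1]))
    output = tf.sigmoid(tf.matmul(hidden, W_output) + b_output)

    loss = tf.reduce_mean(tf.square(output - n_output))      # squared error instead of cross entropy
    train = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    init = tf.initialize_all_variables()
    sess = tf.Session()
    sess.run(init)

    # placeholders only receive values at run time, hence the feed_dict on every run call
    feed = {n_input: input_data, n_output: output_data}
    for epoch in xrange(0, 2001):
        sess.run(train, feed_dict=feed)
        if epoch % 200 == 0:
            print("step: {:>4} | loss: {}".format(epoch, sess.run(loss, feed_dict=feed)))

With the shapes arranged this way, both matmuls are [batch, in] x [in, out], so the graph builds regardless of the batch size.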