Returning all possible prediction values

Asked: 2017-05-05 09:14:05

Tags: tensorflow neural-network

This neural network is trained on the inputs [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], labeled with the outputs [[0.0], [1.0], [1.0], [0.0]]:

import numpy as np
import tensorflow as tf

sess = tf.InteractiveSession()
# (variables are initialized further down with tf.global_variables_initializer().run())
# a batch of inputs of 2 values each
inputs = tf.placeholder(tf.float32, shape=[None, 2])

# a batch of outputs of 1 value each
desired_outputs = tf.placeholder(tf.float32, shape=[None, 1])

# [!] define the number of hidden units in the first layer
HIDDEN_UNITS = 4 
weights_1 = tf.Variable(tf.truncated_normal([2, HIDDEN_UNITS]))

biases_1 = tf.Variable(tf.zeros([HIDDEN_UNITS]))

# connect 2 inputs to every hidden unit. Add bias
layer_1_outputs = tf.nn.sigmoid(tf.matmul(inputs, weights_1) + biases_1)

print(layer_1_outputs)

NUMBER_OUTPUT_NEURONS = 1

biases_2 = tf.Variable(tf.zeros([NUMBER_OUTPUT_NEURONS]))
weights_2 = tf.Variable(tf.truncated_normal([HIDDEN_UNITS, NUMBER_OUTPUT_NEURONS]))
finalLayerOutputs = tf.nn.sigmoid(tf.matmul(layer_1_outputs, weights_2) + biases_2)

tf.global_variables_initializer().run()

logits = tf.nn.sigmoid(tf.matmul(layer_1_outputs, weights_2) + biases_2)

training_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
training_outputs = [[0.0], [1.0], [1.0], [0.0]]

error_function = 0.5 * tf.reduce_sum(tf.subtract(logits, desired_outputs) * tf.subtract(logits, desired_outputs))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function)

for i in range(15):
    _, loss = sess.run([train_step, error_function],
                       feed_dict={inputs: np.array(training_inputs),
                                  desired_outputs: np.array(training_outputs)})

print(sess.run(logits, feed_dict={inputs: np.array([[0.0, 1.0]])}))

After training, this network returns the value [[ 0.61094815]] for the input

[[0.0, 1.0]]

Is [[ 0.61094815]] the value with the highest probability that this network assigns to the input [[0.0, 1.0]] after training? Is it possible to access the lower-probability values, and not just the most likely one?

If I increase the number of training epochs I get a better prediction, but in this case I just want to access all the potential output values with their probabilities for a given input.

Update:

I updated the code to use softmax for multi-class classification, but the prediction for [[0.0, 1.0, 0.0, 0.0]] is [array([0])]. Is my update correct?

import numpy as np
import tensorflow as tf

sess = tf.InteractiveSession()
# (variables are initialized further down with tf.global_variables_initializer().run())
# a batch of inputs of 4 values each
inputs = tf.placeholder(tf.float32, shape=[None, 4])

# a batch of outputs of 3 values each
desired_outputs = tf.placeholder(tf.float32, shape=[None, 3])

# [!] define the number of hidden units in the first layer
HIDDEN_UNITS = 4 
weights_1 = tf.Variable(tf.truncated_normal([4, HIDDEN_UNITS]))

biases_1 = tf.Variable(tf.zeros([HIDDEN_UNITS]))

# connect the 4 inputs to every hidden unit and add the bias
layer_1_outputs = tf.nn.softmax(tf.matmul(inputs, weights_1) + biases_1)

biases_2 = tf.Variable(tf.zeros([3]))
weights_2 = tf.Variable(tf.truncated_normal([HIDDEN_UNITS, 3]))
finalLayerOutputs = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2)

tf.global_variables_initializer().run()

logits = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2)

training_inputs = [[0.0, 0.0 , 0.0, 0.0], [0.0, 1.0 , 0.0, 0.0], [1.0, 0.0 , 0.0, 0.0], [1.0, 1.0 , 0.0, 0.0]]
training_outputs = [[0.0,0.0,0.0], [1.0,0.0,0.0], [1.0,0.0,0.0], [0.0,0.0,1.0]]

error_function = 0.5 * tf.reduce_sum(tf.subtract(logits, desired_outputs) * tf.subtract(logits, desired_outputs))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function)

for i in range(15):
    _, loss = sess.run([train_step, error_function],
                       feed_dict={inputs: np.array(training_inputs),
                                  desired_outputs: np.array(training_outputs)})

prediction=tf.argmax(logits,1)
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])})
print(best)

This prints [array([0])].

Update 2:

Replacing

prediction=tf.argmax(logits,1)
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])})
print(best)

with:

prediction=tf.nn.softmax(logits)
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])})
print(best)

seems to do the trick.

So the full source is now:

import numpy as np
import tensorflow as tf

sess = tf.InteractiveSession()
# (variables are initialized further down with tf.global_variables_initializer().run())
# a batch of inputs of 4 values each
inputs = tf.placeholder(tf.float32, shape=[None, 4])

# a batch of outputs of 3 values each
desired_outputs = tf.placeholder(tf.float32, shape=[None, 3])

# [!] define the number of hidden units in the first layer
HIDDEN_UNITS = 4 
weights_1 = tf.Variable(tf.truncated_normal([4, HIDDEN_UNITS]))

biases_1 = tf.Variable(tf.zeros([HIDDEN_UNITS]))

# connect the 4 inputs to every hidden unit and add the bias
layer_1_outputs = tf.nn.softmax(tf.matmul(inputs, weights_1) + biases_1)

biases_2 = tf.Variable(tf.zeros([3]))
weights_2 = tf.Variable(tf.truncated_normal([HIDDEN_UNITS, 3]))
finalLayerOutputs = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2)

tf.global_variables_initializer().run()

logits = tf.nn.softmax(tf.matmul(layer_1_outputs, weights_2) + biases_2)

training_inputs = [[0.0, 0.0 , 0.0, 0.0], [0.0, 1.0 , 0.0, 0.0], [1.0, 0.0 , 0.0, 0.0], [1.0, 1.0 , 0.0, 0.0]]
training_outputs = [[0.0,0.0,0.0], [1.0,0.0,0.0], [1.0,0.0,0.0], [0.0,0.0,1.0]]

error_function = 0.5 * tf.reduce_sum(tf.subtract(logits, desired_outputs) * tf.subtract(logits, desired_outputs))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function)

for i in range(1500):
    _, loss = sess.run([train_step, error_function],
                       feed_dict={inputs: np.array(training_inputs),
                                  desired_outputs: np.array(training_outputs)})

prediction=tf.nn.softmax(logits)
best = sess.run([prediction],feed_dict={inputs: np.array([[0.0, 1.0, 0.0, 0.0]])})
print(best)

which prints

[array([[ 0.49810624,  0.24845563,  0.25343812]], dtype=float32)]

1 Answer:

Answer 0 (score: 1):

Your current network performs (logistic) regression, not true classification: given an input x, it tries to evaluate f(x) (here f(x) = x1 XOR x2, though the network does not know that before training), which is regression. To do so, it learns a function f1(x) and tries to bring it as close as possible to f(x) on all your training samples. [[ 0.61094815]] is simply the value of f1([[0.0, 1.0]]). In this setting there is no "probability of being in a class", because there are no classes. Only the user (you) chooses to interpret f1(x) as the probability of the output being 1. Since you have only 2 classes, that tells you the probability of the other class is 1 - 0.61094815 (that is, you are doing classification with the output of the network, but the network itself was never really trained for it). Used this way, the approach is a (widely used) trick to perform classification, but it only works when you have 2 classes.
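For instance, with the question's first (sigmoid) network, the two "class probabilities" one can read off are just the network's output and its complement. Here is a minimal sketch, reusing sess, logits and inputs from the first code block above; the probabilistic reading is the user's interpretation, not something the network was trained for:

p = sess.run(logits, feed_dict={inputs: np.array([[0.0, 1.0]])})
print("interpreted P(output = 1):", p)        # e.g. [[ 0.61094815]]
print("interpreted P(output = 0):", 1.0 - p)  # the complementary value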

A network truly built for classification is constructed a bit differently: your logits would have shape (batch_size, number_of_classes), so (1, 2) in your case. You apply a softmax to them, and then the prediction is argmax(softmax), with probability max(softmax). You can then also read off the probability the network assigns to each output: probability(class i) = softmax[i]. Here the network is actually trained to learn the probability of x being in each class.
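Below is a minimal sketch of such a classification network for the XOR example, assuming one-hot labels and a cross-entropy loss; all names and hyperparameters are illustrative choices, not code from the question or the answer:

import numpy as np
import tensorflow as tf

NUM_CLASSES = 2
inputs = tf.placeholder(tf.float32, shape=[None, 2])
labels = tf.placeholder(tf.float32, shape=[None, NUM_CLASSES])  # one-hot targets

# hidden layer, as in the question
weights_1 = tf.Variable(tf.truncated_normal([2, 4]))
biases_1 = tf.Variable(tf.zeros([4]))
hidden = tf.nn.sigmoid(tf.matmul(inputs, weights_1) + biases_1)

# output layer: raw scores of shape (batch_size, NUM_CLASSES)
weights_2 = tf.Variable(tf.truncated_normal([4, NUM_CLASSES]))
biases_2 = tf.Variable(tf.zeros([NUM_CLASSES]))
logits = tf.matmul(hidden, weights_2) + biases_2

probabilities = tf.nn.softmax(logits)         # probability(class i) = softmax[i]
prediction = tf.argmax(probabilities, 1)      # most likely class
confidence = tf.reduce_max(probabilities, 1)  # its probability

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# XOR as one-hot classes: class 0 = "XOR is 0", class 1 = "XOR is 1"
training_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
training_labels = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

for i in range(1500):  # may need more iterations to converge fully
    sess.run(train_step, feed_dict={inputs: training_inputs,
                                    labels: training_labels})

print(sess.run([prediction, confidence, probabilities],
               feed_dict={inputs: np.array([[0.0, 1.0]])}))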

I apologize if my explanation is obscure, or if the difference between a regression between 0 and 1 and classification seems philosophical in a 2-class setting, but if you add more classes you will probably see what I mean.

EDIT: answering your 2 updates.

  • In your training samples, the labels (training_outputs) must be probability distributions, i.e. they must sum to 1 for each sample (99% of the time they take the form (1, 0, 0), (0, 1, 0) or (0, 0, 1)), so your first output [0.0, 0.0, 0.0] is not valid. If you want to learn XOR on the first two inputs, the first output should be the same as the last one: [0.0, 0.0, 1.0].

  • prediction=tf.argmax(logits,1) = [array([0])] is completely normal: logits contains your probabilities, and prediction is the prediction, i.e. the class with the highest probability, which here is class 0. In your training set, [0.0, 1.0, 0.0, 0.0] is associated with the output [1.0, 0.0, 0.0], i.e. it belongs to class 0 with probability 1 and to the other classes with probability 0. After enough training, print(best) with prediction=tf.argmax(logits,1) on the input [1.0, 1.0, 0.0, 0.0] should give you [array([2])], 2 being the index of this input's class in your training set. (Both points are illustrated in the sketch after this list.)
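Putting both points together, here is a minimal sketch, reusing inputs, logits and sess from the question's updated code. Note that logits there is already a softmax output, so it can be read directly as a probability vector:

# labels as valid one-hot probability distributions, with the first
# output corrected to [0.0, 0.0, 1.0] as suggested above
training_outputs = [[0.0, 0.0, 1.0],   # 0 XOR 0 = 0  -> class 2
                    [1.0, 0.0, 0.0],   # 0 XOR 1 = 1  -> class 0
                    [1.0, 0.0, 0.0],   # 1 XOR 0 = 1  -> class 0
                    [0.0, 0.0, 1.0]]   # 1 XOR 1 = 0  -> class 2

# after retraining the question's network with these labels,
# read both views of its output for one test input:
test_input = np.array([[1.0, 1.0, 0.0, 0.0]])
probabilities, best = sess.run([logits, tf.argmax(logits, 1)],
                               feed_dict={inputs: test_input})
print(probabilities)  # full per-class probability vector
print(best)           # most likely class; should tend to [2]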