I am studying machine learning and tried softmax classification with a neural network. During training, labels 1 and 2 learn well, but for label 3 the cost output is always 0.0. I think I don't fully understand neural networks yet.
I tried to build a learning model with input_sequence_length = 3 and output_class = 3, where the label is decided by the input range (a small data-generation sketch follows this list):

0 <= input <= 2 -> result = 1
3 <= input <= 5 -> result = 2
6 <= input <= 8 -> result = 3
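For illustration only, here is a small sketch of how training pairs under this rule could be generated. It is not from the original post; make_example is a hypothetical helper and NumPy is assumed:

import numpy as np

def make_example(rng):
    # Pick a label 1..3, then draw three integers from its range
    # (0-2, 3-5, or 6-8), e.g. [0, 2, 0] -> label 1.
    label = rng.integers(1, 4)
    low, high = (label - 1) * 3, label * 3   # high is exclusive
    return rng.integers(low, high, size=3), label

rng = np.random.default_rng(0)
for _ in range(3):
    print(*make_example(rng))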
Please let me know what I am missing.
The source code below is only part of my program.
Input data (0~2 -> 1, 3~5 -> 2, 6~8 -> 3); the last column (1, 2, 3) is the label:
0 2 0 1
5 4 5 2
7 6 8 3
2 2 0 1
5 3 4 2
7 6 7 3
Output
1. input X : [[0, 2, 0]] Y(label) : [[1]]
cost : 1.25544 hypothesis : [0.30000001, 0.28, 0.41]
2. input X : [[5, 4, 5]] Y(label) : [[2]]
cost : 1.10084 hypothesis : [0.31, 0.36000001, 0.33000001]
3. input X : [[7, 6, 8]] Y(label) : [[3]]
cost : 0.0 hypothesis : [0.28, 0.25, 0.47999999]
4. input X : [[2, 2, 0]] Y(label) : [[1]]
cost : 1.22364 hypothesis : [0.27000001, 0.28999999, 0.44]
5. input X : [[5, 3, 4]] Y(label) : [[2]]
cost : 0.961203 hypothesis : [0.30000001, 0.31999999, 0.38]
6. input X : [[7, 6, 7]] Y(label) : [[3]]
cost : 0.0 hypothesis : [0.27000001, 0.23999999, 0.49000001]
Source code
batch_size = 1
input_sequence_length = 3
output_sequence_length = 1
input_num_classes = 9
output_num_classes = 3
hidden_size = 12
learning_rate = 0.1
with tf.name_scope("placeholder") as scope:
X = tf.placeholder(tf.int32, [None, input_sequence_length], name="x_input")
X_one_hot = tf.one_hot(X, input_num_classes)
Y = tf.placeholder(tf.int32, [None, output_sequence_length], name="y_input") # 1
Y_one_hot = tf.one_hot(Y, output_num_classes) # one hot
Y_one_hot = tf.reshape(Y_one_hot, [-1, output_num_classes])
X_one_hot = tf.reshape(X_one_hot, [batch_size , input_sequence_length * input_num_classes])
outputs = tf.to_float(X_one_hot)
with tf.name_scope("Layer_1") as scope:
W1 = tf.Variable(tf.random_normal([input_sequence_length * input_num_classes, **strong text**hidden_size]), name='weight1')
b1 = tf.Variable(tf.random_normal([hidden_size]), name='bias1')
outputs = tf.sigmoid(tf.matmul(outputs, W1) + b1)
with tf.name_scope("Layer_2") as scope:
W2 = tf.Variable(tf.random_normal([hidden_size, output_num_classes]), name='weight2')
b2 = tf.Variable(tf.random_normal([output_num_classes]), name='bias2')
logits = tf.sigmoid(tf.matmul(outputs, W2) + b2)
with tf.name_scope("hypothesis") as scope:
hypothesis = tf.nn.softmax(logits)
with tf.name_scope("cost") as scope:
# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
with tf.name_scope("train") as scope:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
Answer (score: 0)
The first argument (indices) of tf.one_hot is zero-indexed. With a depth of 3, tf.one_hot(3, 3) produces [0, 0, 0], so the cross-entropy term for label 3 is 0 (every entry of the one-hot target is zero). Labels 1 and 2 still land on valid rows, which is why only label 3 is affected.
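A minimal sketch of both the failure and one possible fix, shifting the labels from 1..3 down to 0..2 before the encoding (the snippet is mine, not from the original post; the names Y and output_num_classes follow the question's code):

import tensorflow as tf

sess = tf.Session()
print(sess.run(tf.one_hot(3, 3)))      # [0. 0. 0.] -> cross entropy is always 0
print(sess.run(tf.one_hot(3 - 1, 3)))  # [0. 0. 1.] -> a valid target row

# In the question's graph, shift the labels before one-hot encoding:
# Y_one_hot = tf.one_hot(Y - 1, output_num_classes)

With the shifted labels every class maps to a nonzero target row, and the cost for label 3 is no longer stuck at 0.0.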