I am studying machine learning and tried softmax classification with a neural network. During training, labels 1 and 2 learn well, but for label 3 the cost output is always 0.0. I think I don't fully understand neural networks yet.
I tried to build a learning model with input_sequence_length = 3 and output_class = 3, where the label is decided by the input range (a small data-generation sketch follows this list):

0 <= input <= 2 -> result = 1
3 <= input <= 5 -> result = 2
6 <= input <= 8 -> result = 3
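For illustration only, here is a small sketch of how training pairs under this rule could be generated. It is not from the original post; make_example is a hypothetical helper and NumPy is assumed:

import numpy as np

def make_example(rng):
    # Pick a label 1..3, then draw three integers from its range
    # (0-2, 3-5, or 6-8), e.g. [0, 2, 0] -> label 1.
    label = rng.integers(1, 4)
    low, high = (label - 1) * 3, label * 3   # high is exclusive
    return rng.integers(low, high, size=3), label

rng = np.random.default_rng(0)
for _ in range(3):
    print(*make_example(rng))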
Please let me know what I am missing.
The source code below is only part of my program.
Input data (0~2 -> 1, 3~5 -> 2, 6~8 -> 3); the last column (1, 2, 3) is the label:
0 2 0 1
5 4 5 2
7 6 8 3
2 2 0 1
5 3 4 2
7 6 7 3
Output
1. input X : [[0, 2, 0]] Y(label) : [[1]]
cost : 1.25544 hypothesis : [0.30000001, 0.28, 0.41]
2. input X : [[5, 4, 5]] Y(label) : [[2]]
cost : 1.10084 hypothesis : [0.31, 0.36000001, 0.33000001]
3. input X : [[7, 6, 8]] Y(label) : [[3]]
cost : 0.0 hypothesis : [0.28, 0.25, 0.47999999]
4. input X : [[2, 2, 0]] Y(label) : [[1]]
cost : 1.22364 hypothesis : [0.27000001, 0.28999999, 0.44]
5. input X : [[5, 3, 4]] Y(label) : [[2]]
cost : 0.961203 hypothesis : [0.30000001, 0.31999999, 0.38]
6. input X : [[7, 6, 7]] Y(label) : [[3]]
cost : 0.0 hypothesis : [0.27000001, 0.23999999, 0.49000001]
Source code
batch_size = 1
input_sequence_length = 3
output_sequence_length = 1
input_num_classes = 9
output_num_classes = 3
hidden_size = 12
learning_rate = 0.1
with tf.name_scope("placeholder") as scope:
X = tf.placeholder(tf.int32, [None, input_sequence_length], name="x_input")
X_one_hot = tf.one_hot(X, input_num_classes)
Y = tf.placeholder(tf.int32, [None, output_sequence_length], name="y_input") # 1
Y_one_hot = tf.one_hot(Y, output_num_classes) # one hot
Y_one_hot = tf.reshape(Y_one_hot, [-1, output_num_classes])
X_one_hot = tf.reshape(X_one_hot, [batch_size , input_sequence_length * input_num_classes])
outputs = tf.to_float(X_one_hot)
with tf.name_scope("Layer_1") as scope:
W1 = tf.Variable(tf.random_normal([input_sequence_length * input_num_classes, **strong text**hidden_size]), name='weight1')
b1 = tf.Variable(tf.random_normal([hidden_size]), name='bias1')
outputs = tf.sigmoid(tf.matmul(outputs, W1) + b1)
with tf.name_scope("Layer_2") as scope:
W2 = tf.Variable(tf.random_normal([hidden_size, output_num_classes]), name='weight2')
b2 = tf.Variable(tf.random_normal([output_num_classes]), name='bias2')
logits = tf.sigmoid(tf.matmul(outputs, W2) + b2)
with tf.name_scope("hypothesis") as scope:
hypothesis = tf.nn.softmax(logits)
with tf.name_scope("cost") as scope:
# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
with tf.name_scope("train") as scope:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
Answer (score: 0)
The first argument (indices) of tf.one_hot is zero-indexed. With a depth of 3, tf.one_hot(3, 3) produces [0, 0, 0], so the cross-entropy term for label 3 is 0 (every entry of the one-hot target is zero). Labels 1 and 2 still land on valid rows, which is why only label 3 is affected.
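A minimal sketch of both the failure and one possible fix, shifting the labels from 1..3 down to 0..2 before the encoding (the snippet is mine, not from the original post; the names Y and output_num_classes follow the question's code):

import tensorflow as tf

sess = tf.Session()
print(sess.run(tf.one_hot(3, 3)))      # [0. 0. 0.] -> cross entropy is always 0
print(sess.run(tf.one_hot(3 - 1, 3)))  # [0. 0. 1.] -> a valid target row

# In the question's graph, shift the labels before one-hot encoding:
# Y_one_hot = tf.one_hot(Y - 1, output_num_classes)

With the shifted labels every class maps to a nonzero target row, and the cost for label 3 is no longer stuck at 0.0.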