TensorFlow 2.X: sparse categorical cross-entropy error

Date: 2020-08-10 18:54:02

Tags: python numpy tensorflow machine-learning keras

I have been experimenting with NLP in TensorFlow. I am trying to predict one word from another word, e.g. input = "go", output = "stop". However, unlike the usual text-generation approach, I am manually splitting the words into characters and using their vectors as features and targets.

I mapped every letter of each word to its corresponding ID number, e.g. a → 1, b → 2, and so on, converted my inputs and outputs accordingly, and added padding to the sequences (a minimal sketch of this step follows the example below).

For example:

  • x: "go" = ['g', 'o'] corresponds to [7, 15]; with padding: [0, 0, 0, 0, 0, 0, 0, 0, 7, 15]
  • y: "stop" = ['s', 't', 'o', 'p'] corresponds to [19, 15, 20, 16]; with padding: [0, 0, 0, 0, 0, 0, 19, 15, 20, 16]
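
For reference, here is a minimal sketch of how I do this encoding (the helper names char_to_id and encode_word are just illustrative, not from any library):

import string

# Map 'a' -> 1, 'b' -> 2, ..., 'z' -> 26; 0 is reserved for padding
char_to_id = {c: i + 1 for i, c in enumerate(string.ascii_lowercase)}

def encode_word(word, max_len=10):
    """Convert a word to a list of character IDs, left-padded with zeros."""
    ids = [char_to_id[c] for c in word.lower()]
    return [0] * (max_len - len(ids)) + ids

print(encode_word("go"))  # [0, 0, 0, 0, 0, 0, 0, 0, 7, 15]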

I created a very simple feed-forward neural network for this (based on TF's beginner tutorial):

# Initialize the FFNN model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(10,)),
  tf.keras.layers.Dense(512, activation='relu'),
  tf.keras.layers.Dense(256, activation='relu'),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

predictions = model(x_train[:1]).numpy()
tf.nn.softmax(predictions).numpy()

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()

However, calling loss_fn(y_train[:1], predictions).numpy() produces the error below (as a brief FYI, y_train[:1] has shape (1, 10)):

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-287-fec370e5011d> in <module>()
----> 1 loss_fn(y_train[:1], predictions).numpy()

12 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py in sparse_softmax_cross_entropy_with_logits(_sentinel, labels, logits, name)
   4089                        "should equal the shape of logits except for the last "
   4090                        "dimension (received %s)." % (labels_static_shape,
-> 4091                                                      logits.get_shape()))
   4092     # Check if no reshapes are required.
   4093     if logits.get_shape().ndims == 2:

ValueError: Shape mismatch: The shape of labels (received (10,)) should equal the shape of logits except for the last dimension (received (1, 10)).
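
For reference, these are the shapes involved, as far as I can tell (just printing them from the snippet above):

print(y_train[:1].shape)   # (1, 10) -- ten integer "labels" for one sample
print(predictions.shape)   # (1, 10) -- ten logits for one sample
# The message seems to say the label shape should equal the logit shape
# minus the last dimension, i.e. (1,), which is not what my data looks like.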

Why does this happen, and what is the solution?

To make it reproducible:

import numpy as np
import tensorflow as tf

x_train = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 7, 15]], dtype=np.int32)
x_test = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 7, 15]], dtype=np.int32)
y_train = np.array([[0, 0, 0, 0, 0, 0, 19, 15, 20, 16]], dtype=np.int32)
y_test = np.array([[0, 0, 0, 0, 0, 0, 19, 15, 20, 16]], dtype=np.int32)

# Initialize the FFNN model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(10,)),
  tf.keras.layers.Dense(512, activation='relu'),
  tf.keras.layers.Dense(256, activation='relu'),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

predictions = model(x_train[:1]).numpy()
tf.nn.softmax(predictions).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
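
For comparison, this is my understanding of a case where the same loss works without a shape error (one integer class index per sample, logits over 10 classes); the numbers here are made up purely to illustrate the expected shapes:

import numpy as np
import tensorflow as tf

loss_check = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# One sample: a single integer class index as the label, ten logits as the prediction
y_ok = np.array([3], dtype=np.int32)                   # shape (1,)
logits_ok = np.random.randn(1, 10).astype(np.float32)  # shape (1, 10)

print(loss_check(y_ok, logits_ok).numpy())  # scalar loss, no shape error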

0 Answers:

No answers yet.