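I have been experimenting with NLP in TensorFlow. I am trying to predict one word from a given word, for example input = stop, output = go. Unlike the usual text-generation techniques, however, I am manually splitting each word into characters and using the resulting vectors as features and targets.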
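I mapped every letter of each word to its corresponding ID number (a: 1, b: 2, and so on), converted my inputs and outputs accordingly, and padded the resulting sequences.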
For example:
x: go = ['g', 'o'], which maps to [7, 15]; with padding: [0, 0, 0, 0, 0, 0, 0, 0, 7, 15]
y: stop = ['s', 't', 'o', 'p'], which maps to [19, 20, 15, 16]; with padding: [0, 0, 0, 0, 0, 0, 19, 20, 15, 16]
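The encoding described above can be sketched in a few lines of plain Python. This is only an illustration: the names `CHAR_TO_ID`, `MAX_LEN`, and `encode_word` are hypothetical, and it assumes lowercase ASCII letters only, with ID 0 reserved for padding and a fixed length of 10.

```python
import string

# Assumption: lowercase ASCII letters only; 0 is reserved for padding.
CHAR_TO_ID = {ch: i for i, ch in enumerate(string.ascii_lowercase, start=1)}
MAX_LEN = 10  # padded length used in the examples above


def encode_word(word, max_len=MAX_LEN):
    """Map each letter to its ID and left-pad with zeros up to max_len."""
    ids = [CHAR_TO_ID[ch] for ch in word.lower()]
    if len(ids) > max_len:
        raise ValueError(f"{word!r} is longer than {max_len} characters")
    return [0] * (max_len - len(ids)) + ids


go_ids = encode_word("go")
stop_ids = encode_word("stop")
print(go_ids)  # [0, 0, 0, 0, 0, 0, 0, 0, 7, 15]
```

Wrapping the results in `np.array([...], dtype=np.int32)` would give arrays of shape (1, 10), like the ones used further down.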
I built a very simple feed-forward neural network for this (based on the basic TensorFlow tutorial pages):
# Initialize the FFNN model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(10,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1]).numpy()
tf.nn.softmax(predictions).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
However, calling loss_fn(y_train[:1], predictions).numpy() raises the following error (quick FYI: y_train[:1] has shape (1, 10)):
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-287-fec370e5011d> in <module>()
----> 1 loss_fn(y_train[:1], predictions).numpy()
12 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py in sparse_softmax_cross_entropy_with_logits(_sentinel, labels, logits, name)
4089 "should equal the shape of logits except for the last "
4090 "dimension (received %s)." % (labels_static_shape,
-> 4091 logits.get_shape()))
4092 # Check if no reshapes are required.
4093 if logits.get_shape().ndims == 2:
ValueError: Shape mismatch: The shape of labels (received (10,)) should equal the shape of logits except for the last dimension (received (1, 10)).
Why does this happen, and how can I fix it?
To make it reproducible:
import numpy as np
import tensorflow as tf
x_train = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 7, 15]], dtype=np.int32)
x_test = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 7, 15]], dtype=np.int32)
y_train = np.array([[0, 0, 0, 0, 0, 0, 19, 20, 15, 16]], dtype=np.int32)
y_test = np.array([[0, 0, 0, 0, 0, 0, 19, 20, 15, 16]], dtype=np.int32)
# Initialize the FFNN model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(10,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1]).numpy()
tf.nn.softmax(predictions).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
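The shape rule quoted in the error message can be tried out without TensorFlow. The sketch below is a NumPy stand-in, not TensorFlow's implementation: `sparse_softmax_cross_entropy` and `VOCAB_SIZE` are hypothetical names, and the vocabulary size of 27 (26 letters plus the padding ID 0) is an assumption. It suggests that labels of shape (1, 10) are rejected against logits of shape (1, 10), but accepted once the logits carry a trailing class axis, i.e. shape (1, 10, 27).

```python
import numpy as np


def sparse_softmax_cross_entropy(labels, logits):
    """NumPy stand-in for sparse softmax cross-entropy (illustration only)."""
    labels = np.asarray(labels)
    logits = np.asarray(logits, dtype=np.float64)
    # Rule from the error message: labels must have the shape of the
    # logits with the last (class) dimension dropped.
    if labels.shape != logits.shape[:-1]:
        raise ValueError(
            f"Shape mismatch: labels {labels.shape} vs logits {logits.shape}"
        )
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Select the log-probability of the true class at every position.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return -picked.squeeze(-1)


# Dummy labels with the same shape as y_train[:1].
labels = np.zeros((1, 10), dtype=np.int64)

# Case 1: logits shaped like the model output above, (1, 10).
try:
    sparse_softmax_cross_entropy(labels, np.zeros((1, 10)))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Case 2: one score per character class at each position.
VOCAB_SIZE = 27  # assumption: 26 letters plus padding ID 0
per_position_loss = sparse_softmax_cross_entropy(
    labels, np.zeros((1, 10, VOCAB_SIZE))
)
print(mismatch_raised, per_position_loss.shape)
```

If that reading is right, one option would be for the model to emit a score for every character class at each of the 10 positions, rather than a single vector of 10 values. Whether that suits the task is a separate question.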