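I am trying to build a convolutional neural network to classify a set of news items into 3 classes. The idea is to group the news in the corpus into blocks of at most 10 news items per minute, with at most 30 words per news item, so the input should be a tensor of shape (samples, 10, 30). This is the input to an embedding operation, which outputs a tensor of shape (samples, 10, 30, 200), where 200 is the embedding dimension of each word.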
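To make the intended shapes concrete, here is a minimal sketch of the lookup an embedding performs, written with plain Python lists rather than Keras. The `shape` helper, the toy `vocab_size` of 50, and the batch of 4 samples are assumptions for illustration only; the 10, 30 and 200 come from the description above.

```python
import random

def shape(nested):
    """Return the shape of a regular nested list as a tuple (hypothetical helper)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

samples, n_news_per_min, n_words_per_news = 4, 10, 30  # 4 samples is a toy value
vocab_size, embedding_dim = 50, 200                    # vocab_size is a toy value

rng = random.Random(0)

# One embedding vector of length embedding_dim per word id.
embedding_table = [[rng.random() for _ in range(embedding_dim)]
                   for _ in range(vocab_size)]

# Integer word ids with shape (samples, 10, 30).
word_ids = [[[rng.randrange(vocab_size) for _ in range(n_words_per_news)]
             for _ in range(n_news_per_min)]
            for _ in range(samples)]

# The lookup replaces every word id by its vector, adding a trailing axis.
embedded = [[[embedding_table[w] for w in news] for news in minute]
            for minute in word_ids]

print(shape(word_ids))  # (4, 10, 30)
print(shape(embedded))  # (4, 10, 30, 200)
```

This only illustrates the shape bookkeeping; it says nothing about how Keras infers shapes internally.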
I pass this embedded input to a convolution, flatten the result, and pass it to a Dense layer to get the final output (3 labels). The model is as follows:
from keras.layers import Input, Embedding, Conv2D, Flatten, Dense
from keras.models import Model

# vocab_size, embedding_dim, embedding_weights, n_news_per_min and
# n_words_per_news are defined elsewhere (10 news per minute, 30 words per news, 200-dim embeddings).
news_inputs = Input(shape=(n_news_per_min, n_words_per_news), name='news_per_min')
news_inputs_embeddings = Embedding(input_dim=vocab_size,
                                   output_dim=embedding_dim,
                                   input_length=n_words_per_news,
                                   weights=[embedding_weights],
                                   trainable=False)(news_inputs)
conv = Conv2D(32, 3, padding='same')(news_inputs_embeddings)
flat = Flatten()(conv)
dense = Dense(16, activation='relu')(flat)
out = Dense(3, activation='softmax')(dense)
model = Model(inputs=news_inputs, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Inspecting the created objects gives the following:
news_inputs:
<tf.Tensor 'news_per_min_9:0' shape=(?, 10, 30) dtype=float32>
news_inputs_embeddings:
<tf.Tensor 'embedding_10/Gather:0' shape=(?, 10, 30, 200) dtype=float32>
conv:
<tf.Tensor 'conv2d_11/BiasAdd:0' shape=(?, 10, 30, 32) dtype=float32>
flat:
<tf.Tensor 'flatten_5/Reshape:0' shape=(?, ?) dtype=float32>
dense:
<tf.Tensor 'dense_11/Relu:0' shape=(?, 16) dtype=float32>
out:
<tf.Tensor 'dense_12/Softmax:0' shape=(?, 3) dtype=float32>
The model summary is as follows:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
news_per_min (InputLayer)    (None, 10, 30)            0
_________________________________________________________________
embedding_12 (Embedding)     (None, 30, 200)           87766800
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 30, 32)            57632
_________________________________________________________________
flatten_7 (Flatten)          (None, 960)               0
_________________________________________________________________
dense_15 (Dense)             (None, 16)                15376
_________________________________________________________________
dense_16 (Dense)             (None, 3)                 51
=================================================================
Total params: 87,839,859
Trainable params: 73,059
Non-trainable params: 87,766,800
When I compile it, everything seems to work fine, but both fit and predict produce this error:
model.fit(X_train, y_train, batch_size=16)
InvalidArgumentError: Matrix size-incompatible: In[0]: [16,9600], In[1]: [960,16]
[[Node: dense_15/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](flatten_7/Reshape, dense_15/kernel/read)]]
[[Node: loss_7/mul/_157 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_693_loss_7/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op 'dense_15/MatMul', defined at:
It seems that the first Dense layer ('dense_15') cannot be connected to the flattened convolutional layer, but I cannot understand why.
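For reference, the two sizes in the error message can be reproduced from the shapes already printed above. This is only arithmetic on those numbers, not a diagnosis: 9600 matches the conv tensor shape (?, 10, 30, 32), while 960 matches the conv shape listed in the summary, (None, 30, 32), where the axis of size 10 is missing.

```python
from math import prod

batch_size = 16                    # from model.fit(..., batch_size=16)
dense_units = 16                   # units of the first Dense layer

runtime_conv_shape = (10, 30, 32)  # per-sample shape of the printed conv tensor
summary_conv_shape = (30, 32)      # per-sample shape listed in the model summary

flattened_at_runtime = prod(runtime_conv_shape)
flattened_in_summary = prod(summary_conv_shape)

# Shapes of the two MatMul operands reported in the error.
print((batch_size, flattened_at_runtime))   # (16, 9600)
print((flattened_in_summary, dense_units))  # (960, 16)

# Kernel plus bias of the first Dense layer, as listed in the summary.
print(flattened_in_summary * dense_units + dense_units)  # 15376
```

So the Dense kernel appears to have been sized from the shapes in the summary, whereas the data flowing through the graph has the larger shape.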
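Also, trying to add a MaxPooling layer after the convolution produces an error (I am not including it here, but it says "IndexError: tuple index out of range").
So I guess there is something wrong with the dimensions of the conv layer.
Any help?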