Dense layer size incompatible after Flatten in Keras, in a convolutional model on word embeddings

Asked: 2018-04-27 22:39:54

Tags: python tensorflow keras embedding

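I am trying to build a convolutional neural network to classify a set of news items into 3 classes. The idea is to collect the whole news corpus in blocks of at most 10 news items per minute, with at most 30 words per news item, so the input should be a tensor of shape (samples, 10, 30). This is the input to the embedding operation, which outputs a tensor of shape (samples, 10, 30, 200), where 200 is the embedding dimension of each word.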
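The intended shapes can be checked with a small NumPy sketch. The batch size of 4, the vocabulary of 50 words, and the random embedding matrix are made up for illustration; only the sizes 10, 30, and 200 come from the description above.

```python
import numpy as np

n_samples = 4            # hypothetical batch size
n_news_per_min = 10      # from the question
n_words_per_news = 30    # from the question
embedding_dim = 200      # from the question
vocab_size = 50          # hypothetical, tiny on purpose

rng = np.random.default_rng(0)

# Integer word indices, one per word slot: (samples, 10, 30)
word_ids = rng.integers(
    0, vocab_size, size=(n_samples, n_news_per_min, n_words_per_news)
)

# One row per vocabulary entry: (vocab_size, 200)
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))

# An embedding lookup is plain row indexing; it appends the embedding axis.
embedded = embedding_matrix[word_ids]
print(embedded.shape)  # (4, 10, 30, 200)
```

With a channels-last 2D convolution, the 10 x 30 grid then acts as the spatial axes and the 200 embedding components act as channels.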

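I pass this embedded input to a convolution, flatten the result, and pass it to Dense layers to obtain the final output (3 labels). The model is as follows: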

from keras.layers import Input, Embedding, Conv2D, Flatten, Dense
from keras.models import Model

news_inputs = Input(shape=(n_news_per_min, n_words_per_news, ), name='news_per_min')
news_inputs_embeddings = Embedding(input_dim=vocab_size,
                                   output_dim=embedding_dim,
                                   input_length=n_words_per_news,
                                   weights=[embedding_weights],
                                   trainable=False)(news_inputs)
conv = Conv2D(32, 3, padding='same')(news_inputs_embeddings)
flat = Flatten()(conv)
dense = Dense(16, activation='relu')(flat)
out = Dense(3, activation='softmax')(dense)
model = Model(inputs=news_inputs, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Inspecting the created objects gives the following:

news_inputs:
<tf.Tensor 'news_per_min_9:0' shape=(?, 10, 30) dtype=float32>

news_inputs_embeddings:
<tf.Tensor 'embedding_10/Gather:0' shape=(?, 10, 30, 200) dtype=float32>

conv:
<tf.Tensor 'conv2d_11/BiasAdd:0' shape=(?, 10, 30, 32) dtype=float32>

flat:
<tf.Tensor 'flatten_5/Reshape:0' shape=(?, ?) dtype=float32>

dense:
<tf.Tensor 'dense_11/Relu:0' shape=(?, 16) dtype=float32>

out:
<tf.Tensor 'dense_12/Softmax:0' shape=(?, 3) dtype=float32>

The model summary is as follows:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
news_per_min (InputLayer)    (None, 10, 30)            0         
_________________________________________________________________
embedding_12 (Embedding)     (None, 30, 200)           87766800  
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 30, 32)            57632     
_________________________________________________________________
flatten_7 (Flatten)          (None, 960)               0         
_________________________________________________________________
dense_15 (Dense)             (None, 16)                15376     
_________________________________________________________________
dense_16 (Dense)             (None, 3)                 51        
=================================================================
Total params: 87,839,859
Trainable params: 73,059
Non-trainable params: 87,766,800
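The parameter counts in this summary can be reproduced by hand, which also shows where the number 960 comes from. The sketch below is plain arithmetic; `vocab_size` is not given in the question and is inferred here from the embedding row of the summary, so treat it as an assumption.

```python
embedding_dim = 200
n_news_per_min = 10
n_words_per_news = 30
conv_filters, kernel = 32, 3

# Assumption: vocabulary size inferred from the Embedding row (87766800 params).
vocab_size = 87766800 // embedding_dim

embedding_params = vocab_size * embedding_dim
# 3x3 kernel over 200 input channels, 32 filters, plus one bias per filter.
conv_params = kernel * kernel * embedding_dim * conv_filters + conv_filters
# Flattened size implied by the reported conv shape (None, 30, 32).
flat_static = n_words_per_news * conv_filters
dense1_params = flat_static * 16 + 16
dense2_params = 16 * 3 + 3

print(embedding_params, conv_params, flat_static, dense1_params, dense2_params)
# 87766800 57632 960 15376 51
print(conv_params + dense1_params + dense2_params)  # 73059

# Flattened size of the tensor (?, 10, 30, 32) that exists at run time.
flat_runtime = n_news_per_min * n_words_per_news * conv_filters
print(flat_runtime)  # 9600
```

The first Dense kernel is therefore sized for 960 inputs, while the tensor inspected earlier, with shape (?, 10, 30, 32), flattens to 9600 values per sample.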

Compiling the model seems to work fine, but fitting or predicting produces this error:

model.fit(X_train, y_train, batch_size=16)
InvalidArgumentError: Matrix size-incompatible: In[0]: [16,9600], In[1]: [960,16]
     [[Node: dense_15/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](flatten_7/Reshape, dense_15/kernel/read)]]
     [[Node: loss_7/mul/_157 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_693_loss_7/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op 'dense_15/MatMul', defined at:

It seems that the first Dense layer ('dense_15') cannot be connected to the flattened convolutional layer, but I can't understand why.
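One plausible reading of the numbers is that the statically inferred shapes and the run-time shapes disagree: the summary reports (None, 30, 200) for the embedding, while the tensor itself is (?, 10, 30, 200). The scalar `input_length=n_words_per_news` is a likely cause, since it declares a single non-batch axis for an input that has two. The sketch below is a simplified, hypothetical stand-in for that kind of shape inference, not the Keras source; `embedding_static_shape`, `conv_same`, and `flat_size` are names invented for this illustration.

```python
def embedding_static_shape(input_shape, output_dim, input_length=None):
    """Hypothetical, simplified model of static shape inference."""
    if input_length is None:
        # No declared length: keep every input axis, append the embedding axis.
        return tuple(input_shape) + (output_dim,)
    if isinstance(input_length, (list, tuple)):
        lengths = tuple(input_length)
    else:
        lengths = (input_length,)
    # Assumed behaviour: the declared lengths replace all non-batch axes.
    return (input_shape[0],) + lengths + (output_dim,)


def conv_same(shape, filters):
    # padding='same' keeps the non-channel axes; channels become `filters`.
    return tuple(shape[:-1]) + (filters,)


def flat_size(shape):
    size = 1
    for dim in shape[1:]:  # skip the batch axis
        size *= dim
    return size


input_shape = (None, 10, 30)
as_posted = embedding_static_shape(input_shape, 200, input_length=30)
without_length = embedding_static_shape(input_shape, 200)
with_tuple = embedding_static_shape(input_shape, 200, input_length=(10, 30))

print(as_posted)                                  # (None, 30, 200)
print(without_length)                             # (None, 10, 30, 200)
print(with_tuple)                                 # (None, 10, 30, 200)
print(flat_size(conv_same(as_posted, 32)))        # 960
print(flat_size(conv_same(without_length, 32)))   # 9600
```

If this reading is right, omitting `input_length` or passing `input_length=(n_news_per_min, n_words_per_news)` should make the summary report (None, 10, 30, 200) and size the Dense kernel for 9600 inputs. This has not been verified against the Keras version used in the question.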

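Also, adding a MaxPooling layer after the convolution produces an error (I'm not including it here, but it says "IndexError: tuple index out of range"), so I guess something is wrong with the dimensions of the conv layer. Any help?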
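The IndexError is consistent with the conv shape (None, 30, 32) shown in the model summary: a 2D pooling layer expects four entries (batch, rows, cols, channels), and reading the fourth entry of a three-entry tuple fails. The function below is a hypothetical stand-in that mimics this, not the Keras implementation.

```python
def pooling2d_static_shape(input_shape, pool=2):
    """Hypothetical stand-in: expects (batch, rows, cols, channels)."""
    rows = input_shape[1] // pool
    cols = input_shape[2] // pool
    return (input_shape[0], rows, cols, input_shape[3])


reported_conv_shape = (None, 30, 32)  # conv2d_13 row of the model summary

message = None
try:
    pooling2d_static_shape(reported_conv_shape)
except IndexError as exc:
    message = str(exc)
    print("IndexError:", message)

# With the four-entry shape the conv tensor actually has, the lookup succeeds.
pooled = pooling2d_static_shape((None, 10, 30, 32))
print(pooled)  # (None, 5, 15, 32)
```

Both symptoms would then share one cause: the shape recorded for the embedding output is missing the news-per-minute axis.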

0 answers:

No answers