Dimension error in Keras

Asked: 2018-03-22 20:46:53

Tags: python neural-network nlp keras

I am trying to implement a simple word2vec model, but I get the following error:

ValueError: Error when checking target: expected dense-softmax to have 3 dimensions, but got array with shape (32, 14).

The variables train_x and train_y each have 32 rows:

[[0 0 0 0 0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 1 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 1 0 0 0 0 0 0 0 0 0]
                          ...]]

and the Python code is as follows:

vocal_size = 14
input = Input(shape=(vocal_size, ), dtype='int32', name='input')
embeddings = Embedding(output_dim=5, input_dim= vocal_size)(input)
output = Dense(vocal_size, use_bias=False, activation='softmax')(embeddings)
model = Model(input=input, output=output)
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()
model.fit(train_x, train_y)



_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 14)                0         
_________________________________________________________________
embeddings (Embedding)       (None, 14, 5)             70        
_________________________________________________________________
dense_1 (Dense)              (None, 14, 14)            70        
=================================================================
Total params: 140
Trainable params: 140
Non-trainable params: 0

Edit

For the sentence "I like stackoverflow" with a context size of 1, I create the following tuples:
("I", "like"), ("like", "I"), ("like", "stackoverflow"), ("stackoverflow", "like")

Then I one-hot encode all of them and feed them to the model.

train_x[0] -> the one-hot encoding of the word "I"; train_y[0] -> the one-hot encoding of the context word "like".
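The pair construction described above can be sketched in plain Python (the helper name and the tiny vocabulary here are hypothetical, just to show the shapes involved):

```python
import numpy as np

def skipgram_pairs(tokens, window=1):
    # Hypothetical helper: build (center, context) index pairs within the window
    pairs = []
    for i in range(len(tokens)):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((tokens[i], tokens[j]))
    return pairs

vocab = {'I': 0, 'like': 1, 'stackoverflow': 2}
sentence = ['I', 'like', 'stackoverflow']
pairs = skipgram_pairs([vocab[w] for w in sentence])
# pairs == [(0, 1), (1, 0), (1, 2), (2, 1)]

# One-hot encode centers (train_x) and contexts (train_y)
eye = np.eye(len(vocab))
train_x = np.array([eye[c] for c, _ in pairs])
train_y = np.array([eye[t] for _, t in pairs])
```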

编辑2

First attempt at an encoding with skip-gram: treat 0 as a special word (i.e. not among the 10,000 most common) and count from 1. I assume I should feed in a number and get a one-hot encoding out, i.e. for ("stack", "overflow"), input [3] ("stack") and output [0,0,0,0,1,0,0,0,0,0,0] ("overflow").

Input(shape=(1,)..) -> 
Embedding(output_dim=embedding_size, input_dim=vocab_size, mask_zero=True, ...) -> 
Dense(vocab_size+1, activation="softmax")
model.compile(optimizer='SGD', loss='categorical_crossentropy')

i.e. with embedding_size = 5, feeding in the sentence from your example,

https://imgur.com/a/32m4z

1 answer:

Answer 0: (score: 0)

Thanks for your edits. You are running into trouble for two reasons, one shallow and one deep. First, the shallow one: after the Embedding layer your data is three-dimensional, so the Dense layer that follows also outputs three dimensions, while your target array is two-dimensional. You can fix this with Flatten:
input = Input(shape=(vocal_size, ), dtype='int32', name='input')
embeddings = Embedding(output_dim=5, input_dim=vocal_size+1, input_length=vocal_size)(input)
flat = Flatten()(embeddings)
output = Dense(vocal_size, use_bias=False, activation='softmax')(flat)

The deep reason is that one-hot encoding and embeddings are two options serving the same purpose, so you don't need both (see here and here).
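One way to see why one-hot encoding and an Embedding layer are redundant together: an embedding lookup is exactly a one-hot vector multiplied by the embedding matrix. A quick NumPy check (the array sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 10, 5
E = rng.normal(size=(vocab_size, dim))  # a stand-in for the learned embedding matrix

idx = 3
one_hot = np.zeros(vocab_size)
one_hot[idx] = 1
# Row lookup and one-hot matrix product give the same vector
assert np.allclose(E[idx], one_hot @ E)
```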

An Embedding layer expects "sentences" as sequences of integers, each integer representing a word (or tuple), together with the vocabulary size. So something like

['Welcome to stack overflow',
'stack overflow is great',
'Hope it\'s helpful to you']

将表示为

[[1,2,3,4,0],[3,4,5,6,0],[7,8,9,2,10]] 
# 0s are there to "pad" sentences 1 & 2 as they all need to be the same length

and fed into an Embedding layer like this:

input = Input(shape=(5, ), dtype='int32')
embeddings = Embedding(output_dim=5, input_dim=11, input_length=5)(input)
#input dim is 11 because we want 1 more than the number of words in our vocabulary
#padding can be done with the keras function pad_sequences
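The padding mentioned in the comment can be done with keras's pad_sequences; a minimal pure-Python sketch of what post-padding does looks like this (the helper name is hypothetical):

```python
def pad_post(seqs, maxlen, value=0):
    # Pad (or truncate) each sequence at the end to exactly `maxlen` entries,
    # mirroring keras.preprocessing.sequence.pad_sequences(..., padding='post')
    return [list(s)[:maxlen] + [value] * max(0, maxlen - len(s)) for s in seqs]

seqs = [[1, 2, 3, 4], [3, 4, 5, 6], [7, 8, 9, 2, 10]]
print(pad_post(seqs, 5))
# [[1, 2, 3, 4, 0], [3, 4, 5, 6, 0], [7, 8, 9, 2, 10]]
```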

As I'm sure you know, the one-hot encoding of our sentences looks like this:

[[1,1,1,1,0,0,0,0,0,0],
 [0,0,1,1,1,1,0,0,0,0],
 [0,1,0,0,0,0,1,1,1,1]]

Because the sentences are already transformed (the one-hot encoding has already "embedded" our sentences as binary vectors in a 10-dimensional space), we can feed them directly to the Dense layer without further embedding:

input = Input(shape=(vocal_size, ), dtype='int32', name='input')
output = Dense(vocal_size, use_bias=False, activation='softmax')(input)

Here is a functional toy example using both approaches:

from keras.layers import Dense,Activation,Embedding,Input,Flatten
from keras import Model
import numpy as np

words = ['Welcome to stack overflow',
    'stack overflow is great',
    'Hope it\'s helpful to you']

a = [[1,2,3,4,0],[3,4,5,6,0],[7,8,9,2,10]]
b = [[1,1,1,1,0,0,0,0,0,0],
 [0,0,1,1,1,1,0,0,0,0],
 [0,1,0,0,0,0,1,1,1,1]]
c = [1,1,0] #hypothetical target is "references stack overflow"

input = Input(shape=(5, ), dtype='int32', name='input')
embeddings = Embedding(output_dim=5, input_dim=11, input_length=5)(input)
flat = Flatten()(embeddings)
output = Dense(1, activation='sigmoid')(flat)  # sigmoid, not softmax: softmax over a single unit is always 1
model = Model(inputs=input, outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
model.fit(np.array(a), np.array(c))

input2 = Input(shape=(10, ), dtype='float32')
output2 = Dense(1, activation='sigmoid')(input2)  # sigmoid for the binary target
model2 = Model(inputs=input2, outputs=output2)
model2.compile(optimizer='adam', loss='binary_crossentropy')
model2.summary()
model2.fit(np.array(b), np.array(c))
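Finally, combining the Flatten fix with the single-index input from Edit 2, a sketch of the skip-gram model the asker is after might look like this (a sketch using the Keras 2 functional API; the vocabulary size matches the question, but the training pairs here are hypothetical toy data):

```python
from keras.layers import Input, Embedding, Flatten, Dense
from keras import Model
import numpy as np

vocab_size = 14
embedding_size = 5

# One integer word index in, a softmax over the whole vocabulary out
inp = Input(shape=(1,), dtype='int32')
emb = Embedding(output_dim=embedding_size, input_dim=vocab_size)(inp)  # (None, 1, 5)
flat = Flatten()(emb)                                                  # (None, 5)
out = Dense(vocab_size, use_bias=False, activation='softmax')(flat)    # (None, 14)
model = Model(inputs=inp, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Hypothetical training pairs: center-word indices in, one-hot context targets out
x = np.array([[0], [1], [1], [2]])
y = np.eye(vocab_size)[[1, 0, 2, 1]]
model.fit(x, y, verbose=0)
```

With this setup the target shape (batch, 14) matches the model's two-dimensional output, so the original ValueError goes away.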