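Keras - Input shape for the Embedding layer

Date: 2017-07-14 20:28:25

Tags: python machine-learning neural-network keras

I am trying to implement a convolutional autoencoder in Keras with the layers shown below. My data has 1108 rows and 29430 columns.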

import keras
from keras.layers import Embedding, Dropout, Conv1D, MaxPooling1D, UpSampling1D

def build(features, embedding_dims, maxlen, filters, kernel_size):
    m = keras.models.Sequential()

    m.add(Embedding(features, embedding_dims, input_length=maxlen))
    m.add(Dropout(0.2))

    m.add(Conv1D(filters, kernel_size, padding='valid', activation='relu', strides=1, input_shape=(len(xx), features) ))
    m.add(MaxPooling1D())

    m.add(Conv1D(filters, kernel_size, padding='valid', activation='relu', strides=1, input_shape=(None, len(xx), features) ))
    m.add(UpSampling1D())

    m.summary()
    m.compile(optimizer="adagrad", loss='mse', metrics=['accuracy'])
    return m

early = keras.callbacks.EarlyStopping(
        monitor='val_loss', patience=10, verbose=1, mode='min')

model = build(len(xx[0]), 60, 11900, 70, 3)

model.fit(xx, xx, batch_size=4000, nb_epoch=10000,validation_split=0.1, 
    callbacks=[early])

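However, I get an error message saying ValueError: Error when checking input: expected embedding_1_input to have shape (None, 11900) but got array with shape (1108, 29430). Why does the first layer expect (None, maxlen) rather than the actual size of my data?

I am also including my model summary: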

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 11900, 60)         714000    
_________________________________________________________________
dropout_1 (Dropout)          (None, 11900, 60)         0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 11898, 70)         12670     
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 5949, 70)          0         
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 5947, 70)          14770     
_________________________________________________________________
up_sampling1d_1 (UpSampling1 (None, 11894, 70)         0         
=================================================================
Total params: 741,440
Trainable params: 741,440
Non-trainable params: 0
_________________________________________________________________
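
The numbers in this summary can be checked by hand. The sketch below is not Keras code; it only redoes the arithmetic, assuming the Keras defaults of pool_size=2 for MaxPooling1D and size=2 for UpSampling1D. The helper names are made up for illustration.

```python
def conv1d_valid_len(length, kernel_size, strides=1):
    # 'valid' padding adds no zeros, so the kernel must fit entirely inside the sequence
    return (length - kernel_size) // strides + 1

def pool_len(length, pool_size=2):
    # MaxPooling1D with its stride equal to pool_size (assumed default)
    return (length - pool_size) // pool_size + 1

def conv1d_params(in_channels, filters, kernel_size):
    # kernel_size * in_channels weights per filter, plus one bias per filter
    return kernel_size * in_channels * filters + filters

embedding_dims, maxlen, filters, kernel_size = 60, 11900, 70, 3

conv1 = conv1d_valid_len(maxlen, kernel_size)   # conv1d_1
pool = pool_len(conv1)                          # max_pooling1d_1
conv2 = conv1d_valid_len(pool, kernel_size)     # conv1d_2
up = conv2 * 2                                  # up_sampling1d_1 (assumed size=2)

lengths = [conv1, pool, conv2, up]
params = [conv1d_params(embedding_dims, filters, kernel_size),
          conv1d_params(filters, filters, kernel_size)]

# Embedding params = input_dim * output_dim, so the first argument was:
embedding_input_dim = 714000 // embedding_dims

print(lengths)              # [11898, 5949, 5947, 11894]
print(params)               # [12670, 14770]
print(embedding_input_dim)  # 11900
```

The Embedding row works out to 11900 x 60 = 714,000 parameters, which suggests this summary was printed with 11900 as the layer's first argument as well as its input_length.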

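2 Answers:

Answer 0 (score: 1)

The input to your Embedding layer must be one-dimensional per sample, so you need to reshape your data into the format (, n), that is, n values per row. Whatever you pass as input_length has to match n.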
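
To make the mismatch concrete, here is a minimal sketch of the kind of comparison behind the error. check_input is a hypothetical helper written for illustration, not a Keras function; it only assumes that the layer expects (None, input_length), where None stands for any batch size.

```python
def check_input(data_shape, input_length):
    """Hypothetical helper: compare a data shape with what Embedding expects."""
    expected = (None, input_length)  # None = any number of rows
    data_shape = tuple(data_shape)
    # Every axis except the batch axis has to match exactly
    if len(data_shape) != len(expected) or data_shape[1:] != expected[1:]:
        return "expected shape {} but got array with shape {}".format(
            expected, data_shape)
    return "ok"

mismatch = check_input((1108, 29430), 11900)  # the call from the question
match = check_input((1108, 29430), 29430)     # input_length equal to the column count

print(mismatch)
print(match)
```

With the question's values, the first call reproduces the shapes from the error message, and the second passes because input_length equals the number of columns.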

Answer 1 (score: 1)

I fixed this particular error by adding an input_shape argument to the Embedding layer, as follows:

m.add(Embedding(features, embedding_dims, input_length=maxlen, input_shape=(features, ) ))

Here features is the number of features (29430).
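
Both answers come down to making the sequence length the layer expects equal to the number of columns. As a rough sketch, the two Embedding arguments could be derived from the data instead of being hard-coded. This assumes the rows hold non-negative integer indices, which the question does not state, and embedding_args is a hypothetical helper name.

```python
def embedding_args(rows):
    """Hypothetical helper: derive (input_dim, input_length) from index data."""
    lengths = {len(r) for r in rows}
    if len(lengths) != 1:
        raise ValueError("all rows must have the same length; pad or truncate first")
    input_length = lengths.pop()                # number of columns per sample
    input_dim = max(max(r) for r in rows) + 1   # indices run from 0 to input_dim - 1
    return input_dim, input_length

# Toy stand-in for xx: 2 samples, 4 integer indices each
toy = [[0, 3, 1, 2], [4, 4, 0, 1]]
input_dim, input_length = embedding_args(toy)
print(input_dim, input_length)  # 5 4
```

The results would then be passed as Embedding(input_dim, embedding_dims, input_length=input_length), so that the first argument is the vocabulary size and input_length is the row length.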