How can I pass sequences of different lengths to an LSTM in Keras?

Time: 2017-07-28 09:21:24

Tags: keras lstm keras-layer

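My X_train consists of 744,983 samples divided into 24,443 sequences, and the number of samples differs from sequence to sequence. Each sample is a 30-dimensional vector. How can I feed this data into a Keras LSTM? Here is some information about the training set: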

print(type(X_train))
print(np.shape(X_train))
print(type(X_train[0]))
print(np.shape(X_train[0]))

<class 'list'>
(24443, )
<class 'numpy.ndarray'>
(46, 30)

When I set the parameters this way:

model = Sequential()
model.add(LSTM(4, input_shape = (30, ), return_sequences=True,))
model.add(Dense(1))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(X_train, y_train, epochs=1, batch_size=1, verbose=2)

the error is "Input 0 is incompatible with layer lstm_24: expected ndim=3, found ndim=2".

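If I change input_shape from (30,) to (None, 30), the code runs for about a minute and then fails while checking the model input: "Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of 24443 arrays".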

Also, if I convert X_train to a NumPy array before fitting, the error becomes: "expected lstm_26_input to have 3 dimensions, but got array with shape (24443, 1)".

I also tried padding the sequences:

X_train = sequence.pad_sequences(X_train)
X_test = sequence.pad_sequences(X_test)

However, this turns my input values into 0, 1, and -1 everywhere:

#X_train = np.array(X_train)
#X_test = np.array(X_test)
print(X_train[0])
[[ 0  0  0 ...,  0  0  0]
 [ 0  0  0 ...,  0  0  0]
 [ 0  0  0 ...,  0  0  0]
 ..., 
 [ 0  0  0 ...,  0  1 -1]
 [ 0  0  0 ...,  0  1  0]
 [ 0  0  0 ...,  0  0  0]]

1 Answer:

Answer 0: (score: 0)

By default, sequence.pad_sequences casts the data to the int32 dtype:

tf.keras.preprocessing.sequence.pad_sequences(
    sequences,
    maxlen=None,
    dtype='int32',  # problem is here
    padding='pre',
    truncating='pre',
    value=0.0
)
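To see why an integer dtype produces the 0, 1, and -1 values from the question, here is a minimal NumPy sketch. The values in `row` are hypothetical stand-ins for one row of a real sample, and the explicit `astype` call is an assumption used to imitate the effect of an integer output dtype, not the library's internal code.

```python
import numpy as np

# Hypothetical feature values standing in for one row of a real sample.
row = np.array([0.3, 1.7, -1.2, 0.9, -0.4], dtype="float64")

# Casting floats to int32 truncates toward zero, so the fractional part is lost.
as_int = row.astype("int32")

# Keeping a float dtype preserves the values (up to float32 precision).
as_float = row.astype("float32")

print(as_int.tolist())   # [0, 1, -1, 0, 0]
print(as_float.dtype)    # float32
```

Features that mostly lie between -2 and 2 therefore collapse to 0, 1, and -1 after the cast.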

Try changing dtype to float32:

X_train = sequence.pad_sequences(X_train, dtype='float32')
X_test = sequence.pad_sequences(X_test, dtype='float32')
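For illustration, the padding step can also be sketched in plain NumPy. The helper name `pad_pre` and the toy data are hypothetical. The sketch only mirrors the default `padding='pre'` behavior, does no truncation, and assumes every sequence has shape (timesteps, features).

```python
import numpy as np

def pad_pre(sequences, dtype="float32", value=0.0):
    """Left-pad a list of (timesteps, features) arrays to a common length."""
    maxlen = max(len(s) for s in sequences)
    n_features = sequences[0].shape[1]
    out = np.full((len(sequences), maxlen, n_features), value, dtype=dtype)
    for i, seq in enumerate(sequences):
        # Write each sequence at the end of its slot; earlier timesteps keep `value`.
        out[i, maxlen - len(seq):] = seq
    return out

# Toy data: three sequences of different lengths, 30 features each.
rng = np.random.default_rng(0)
toy = [rng.normal(size=(n, 30)) for n in (46, 12, 33)]

padded = pad_pre(toy)
print(padded.shape, padded.dtype)  # (3, 46, 30) float32
```

The result is a single 3-D array of shape (sequences, timesteps, features), which is the layout an LSTM with input_shape=(None, 30) expects. If the zero padding should be ignored during training, placing a Masking layer with mask_value=0.0 in front of the LSTM is one option worth checking against the Keras documentation.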