Using Keras LSTM layers with examples of different sequence lengths

Date: 2019-08-20 14:43:33

Tags: tensorflow keras lstm

I need to train a stack of LSTM layers on examples with different sequence lengths. With a Keras Sequential model, this can be done with the following code.

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(None, 5)))
model.add(LSTM(8, return_sequences=True))
model.add(Dense(2, activation='sigmoid'))

def train_generator():
    while True:
        sequence_length = np.random.randint(10, 100)
        x_train = np.random.random((1000, sequence_length, 5))
        # y_train will depend on past 5 timesteps of x
        y_train = x_train[:, :, 0].copy()  # copy: the slice is a view of x_train
        for i in range(1, 5):
            y_train[:, i:] += x_train[:, :-i, i]
        y_train = to_categorical(y_train > 2.5)
        yield x_train, y_train

model.compile(optimizer='adam', loss='mse')
model.fit_generator(train_generator(), steps_per_epoch=2, epochs=2, verbose=1)
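As an aside, `x_train[:, :, 0]` is a NumPy view, so the in-place `+=` on it would also mutate `x_train`; copying the slice first keeps the inputs intact. A minimal demonstration (NumPy only, independent of the model above):

```python
import numpy as np

x = np.random.random((2, 6, 5))

view = x[:, :, 0]                 # basic slicing returns a view
assert np.shares_memory(view, x)  # += on `view` would also change x

y = x[:, :, 0].copy()             # an explicit copy is detached
assert not np.shares_memory(y, x)
y[:, 1:] += x[:, :-1, 1]          # safe: x is untouched
```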

The above is based on what I found in another question: https://datascience.stackexchange.com/questions/26366/training-an-rnn-with-examples-of-different-lengths-in-keras. The code works fine, and the model can be trained on examples of different lengths.

However, in my case I need to subclass tf.keras.Model rather than use a Sequential model.

class LSTMModel(tf.keras.Model):
    def __init__(self):
        super(LSTMModel, self).__init__()

        self._lstm_0 = LSTM(32, return_sequences=True, input_shape=(None, 5)) 
        self._lstm_1 = LSTM(8, return_sequences=True)
        self._dense = Dense(2, activation='sigmoid')

    def call(self, inputs, training=False):
        output = self._lstm_0(inputs)
        output = self._lstm_1(output)
        output = self._dense(output)

        return output

I expected the second snippet to be equivalent to the first. However, it crashes with the following error message.

BaseCollectiveExecutor::StartAbort Invalid argument: Operation expected a list with 33 elements but got a list with 27 elements

Can someone tell me why, and suggest a fix?

1 Answer:

Answer 0 (score: 1)

Everything here was run with TensorFlow 1.14.

I ran the following code to bring in all the tools:

import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.utils import to_categorical

def train_generator():
    while True:
        sequence_length = np.random.randint(10, 100)
        x_train = np.random.random((1000, sequence_length, 5))
        # y_train will depend on past 5 timesteps of x
        y_train = x_train[:, :, 0].copy()  # copy: the slice is a view of x_train
        for i in range(1, 5):
            y_train[:, i:] += x_train[:, :-i, i]
        y_train = to_categorical(y_train > 2.5)
        yield x_train, y_train
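Each batch from this generator uses a single sequence length drawn at random, so consecutive batches differ in length while samples within one batch agree. The resulting shapes can be sketched without TensorFlow, using a hypothetical `one_hot` helper in place of `to_categorical`:

```python
import numpy as np

def one_hot(labels, num_classes=2):
    # minimal stand-in for tf.keras.utils.to_categorical (assumption)
    return np.eye(num_classes)[labels.astype(int)]

def shape_generator(batch=4):
    while True:
        sequence_length = np.random.randint(10, 100)
        x = np.random.random((batch, sequence_length, 5))
        y = x[:, :, 0].copy()
        for i in range(1, 5):
            y[:, i:] += x[:, :-i, i]
        yield x, one_hot(y > 2.5)

x, y = next(shape_generator())
assert x.shape[0] == y.shape[0] == 4   # same batch size
assert x.shape[1] == y.shape[1]        # same (random) sequence length
assert y.shape[2] == 2                 # two one-hot classes
```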

Then the first model is:

model_1 = Sequential()
model_1.add(LSTM(32, return_sequences=True, input_shape=(None, 5)))
model_1.add(LSTM(8, return_sequences=True))
model_1.add(Dense(2, activation='sigmoid'))
model_1.compile(optimizer="adam", loss="mse")
model_1.fit_generator(train_generator(), steps_per_epoch=2, epochs=2, verbose=1)

The second model is:

class LSTMModel(Model):
    def __init__(self):
        super(LSTMModel, self).__init__()

        self._lstm_0 = LSTM(32, return_sequences=True, input_shape=(None, 5)) 
        self._lstm_1 = LSTM(8, return_sequences=True)
        self._dense = Dense(2, activation='sigmoid')

    def call(self, inputs, training=False):
        output = self._lstm_0(inputs)
        output = self._lstm_1(output)
        output = self._dense(output)

        return output

model_2 = LSTMModel()
model_2.compile(optimizer="adam", loss="mse")
model_2.fit_generator(train_generator(), steps_per_epoch=2, epochs=2, verbose=1)

I get the same result with both. If it still fails for you, please edit the question with more information, such as the TF version you are running.


According to an open GitHub issue, this is a bug that has been resolved in tf-nightly-gpu-2.0-preview.