How to set steps_per_epoch in Keras fit_generator with variable input length

Asked: 2019-08-10 20:30:57

Tags: tensorflow keras lstm autoencoder seq2seq

I need to feed the input data to the model in such a way that sentences of the same length end up in the same batch (variable input length for an LSTM).

My problem is that fit_generator requires steps_per_epoch and validation_steps to be specified, but in my case I can't simply compute num_train_steps = len(Xtrain) // BATCH_SIZE, because my batches are formed by sentence length rather than by a fixed size. Where can I calculate this value and pass it to fit_generator? My sentence generator has the information needed for steps_per_epoch, but I don't know how to get it out and pass it to fit_generator.

Is there any way to return the length of each batch from sentence_generator?

This is the fit_generator call (I don't know how to compute num_train_steps and pass it to fit_generator):

lstm_ae_model.fit_generator(train_gen, val_gen, num_train_steps, num_val_steps, dir, NUM_EPOCHS=1)

And my custom generator looks like this, in case it helps:

import itertools
import numpy as np

def sentence_generator(X, embeddings):
    while True:
        # loop once per epoch: sort sentences by length so that
        # groupby puts all sentences of equal length next to each other
        items = sorted(X.values(), key=len, reverse=True)
        for length, dics in itertools.groupby(items, len):
            # dics is all the nested dictionaries with this length;
            # materialize it, since groupby yields one-shot iterators
            batch = list(dics)
            num_train_steps = len(batch)  # size of this batch (one batch per length)
            sent_wids = np.zeros([len(batch), length], dtype=int)
            for index_sentence, temp_sentence in enumerate(batch):
                keys_words = list(temp_sentence.keys())
                for index_word in range(len(keys_words)):
                    sent_wids[index_sentence, index_word] = lookup_word2id(keys_words[index_word])
            # yield exactly one batch per length group (autoencoder: input == target)
            Xbatch = embeddings[sent_wids]
            yield Xbatch, Xbatch
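
One detail worth noting about this approach: itertools.groupby only groups consecutive items, which is why the sentences must be sorted by length first; each resulting group then becomes one batch. A small self-contained illustration (with made-up strings standing in for sentences):

import itertools

items = ["ab", "c", "def", "gh", "i"]
items = sorted(items, key=len, reverse=True)  # groupby only merges adjacent items
for length, group in itertools.groupby(items, len):
    print(length, list(group))
# 3 ['def']
# 2 ['ab', 'gh']
# 1 ['c', 'i']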

1 Answer:

Answer 0 (score: 1)

What you can do is first write a function that pre-computes the value of steps_per_epoch by iterating over the dataset and counting how many batches the generator will yield, then pass that value to fit_generator. Something like this:

import itertools

def compute_steps(X):
    # the generator yields exactly one batch per distinct sentence length,
    # so steps_per_epoch is the number of length groups
    items = sorted(X.values(), key=len, reverse=True)
    count = 0
    for length, dics in itertools.groupby(items, len):
        count += 1
    return count

spe = compute_steps(...)
gen = sentence_generator(...)
model.fit_generator(gen, steps_per_epoch=spe)

And do the same for the validation data.
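
Putting it together, a minimal sketch of the full call. Here Xtrain and Xval are hypothetical names for the training and validation dictionaries, and lstm_ae_model and embeddings are assumed to be defined as in the question:

# Xtrain/Xval are assumed to be dictionaries shaped like X above
train_gen = sentence_generator(Xtrain, embeddings)
val_gen = sentence_generator(Xval, embeddings)

lstm_ae_model.fit_generator(
    train_gen,
    steps_per_epoch=compute_steps(Xtrain),
    validation_data=val_gen,
    validation_steps=compute_steps(Xval),
    epochs=1,
)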