Training a model in batches with fit_generator

Asked: 2020-03-05 09:30:10

Tags: tensorflow machine-learning keras

My model has 100,000 image training samples. How do I modify the code below so it trains in batches? With model.fit_generator I have to specify this inside the generator function:

from numpy import array

def data_generator(descriptions, features, n_step, max_sequence):
    # loop forever so the generator never runs out during training
    while 1:
        # walk over the dataset in chunks of n_step descriptions
        for i in range(0, len(descriptions), n_step):
            Ximages, XSeq, y = list(), list(), list()
            for j in range(i, min(len(descriptions), i + n_step)):
                image = features[j]
                # retrieve text input
                desc = descriptions[j]
                # generate input-output pairs for every token in the description
                in_img, in_seq, out_word = preprocess_data([desc], [image], max_sequence)
                for k in range(len(in_img)):
                    Ximages.append(in_img[k])
                    XSeq.append(in_seq[k])
                    y.append(out_word[k])
            # yield this batch of samples to the model
            yield [[array(Ximages), array(XSeq)], array(y)]

My model.fit_generator code:

model.fit_generator(data_generator(texts, train_features, 1, 150), 
                    steps_per_epoch=1500, epochs=50, callbacks=callbacks_list, verbose=1)

Any help would be great; I'm training on a 16GB Tesla V100 cloud instance.

Edit: my image-captioning model creates one training sample for every token in the DSL (250 tokens). With a dataset of 50 images (equivalent to 12,500 training samples) and a batch size of 1, I get an OOM. With around 32 images (equivalent to 8,000 samples) and a batch size of 1, it trains fine. My question is: can I optimize the code further, or is using multiple GPUs my only option?
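For reference, a back-of-the-envelope sketch of the effective batch size this implies (the 250 token-level samples per image is the figure from the edit above; names are illustrative):

tokens_per_image = 250   # samples produced by preprocess_data for one description
n_step = 1               # descriptions packed into a single yield
effective_batch = n_step * tokens_per_image
print(effective_batch)   # roughly 250 samples pushed through the model per step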

Fix:

steps_per_epoch must equal ceil(num_samples / batch_size), so with a dataset of 1,500 samples and a batch size of 1, steps_per_epoch should be 1,500. I also reduced the LSTM sliding window from 48 to 24.

steps_per_epoch: Integer. Total number of steps (batches of samples) to yield from the generator before declaring one epoch finished and starting the next epoch. It should typically be equal to ceil(num_samples / batch_size). Optional for Sequence: if unspecified, will use len(generator) as the number of steps.
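As a hedged sketch of how the corrected call could look (num_samples and batch_size are illustrative names standing in for your dataset size and the n_step passed to the generator):

import math

num_samples = 12500   # e.g. 50 images x 250 token-level samples
batch_size = 32       # the n_step passed to data_generator

model.fit_generator(data_generator(texts, train_features, batch_size, 150),
                    steps_per_epoch=math.ceil(num_samples / batch_size),
                    epochs=50, callbacks=callbacks_list, verbose=1)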

2 Answers:

Answer 0 (score: 0)

The generator already returns batches.

Each yield is one batch. You are completely free to design the batch generator however your problem requires.

In your code, the batch size is n_step.
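A quick way to see this, assuming descriptions, features, and max_sequence are in scope as in the question (names reused purely for illustration):

# peek at one batch: each yield packs n_step descriptions' worth of samples
gen = data_generator(descriptions, features, 32, max_sequence)
(imgs, seqs), targets = next(gen)
print(imgs.shape, seqs.shape, targets.shape)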

Answer 1 (score: 0)

This is the right way to use a generator: make a generator that yields individual data points, build a Dataset from it, and then use the batch method on that object. Tune the parameter until you find the largest batch size that does not cause an OOM.

import tensorflow as tf

def data_generator(descriptions, features, max_sequence):
    def _gen():
        # yield one (inputs, target) example at a time
        for img, seq, word in zip(*preprocess_data(descriptions, features, max_sequence)):
            yield {'image': img, 'seq': seq}, word
    return _gen


ds = tf.data.Dataset.from_generator(
    data_generator(descriptions, features, max_sequence),
    output_types=({'image': tf.float32, 'seq': tf.float32}, tf.int32),
    output_shapes=({
            'image': tf.TensorShape([blah, blah]),
            'seq': tf.TensorShape([blah, blah]),
        },
        tf.TensorShape([blah])
    )
)

ds = ds.batch(n_step)
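From here the dataset can be passed straight to fit; a minimal sketch, assuming the model's input layers are named 'image' and 'seq' so they match the dict keys used above:

ds = ds.prefetch(tf.data.experimental.AUTOTUNE)  # overlap data preparation with training
model.fit(ds, epochs=50, callbacks=callbacks_list)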