Using TFLearn's Trainer causes a RecursionError with a bidirectional RNN

Date: 2018-01-05 08:22:27

Tags: tensorflow

I followed this example to use TFLearn's Trainer class to train a model of my own that isn't covered by TFLearn. So I have this code:

import dataset_utils
import tensorflow as tf
import tflearn

from tensorflow.contrib import grid_rnn


def main(_):
    image_paths, labels = dataset_utils.read_dataset_list('../test/dummy_labels_file.txt')
    data_dir = "../test/dummy_data/"
    images = dataset_utils.read_images(data_dir=data_dir, image_paths=image_paths, image_extension='png')
    print('Done reading images')
    images = dataset_utils.resize(images, (1596, 48))
    images = dataset_utils.transpose(images)
    labels = dataset_utils.encode(labels)
    x_train, x_test, y_train, y_test = dataset_utils.split(features=images, test_size=0.5, labels=labels)
    y_train = dataset_utils.convert_to_sparse(y_train)
    y_test = dataset_utils.convert_to_sparse(y_test)

    with tf.Graph().as_default():
        X = tf.placeholder(tf.float32, [None, None, 48])
        Y = tf.sparse_placeholder(tf.int32)
        seq_lens = tf.placeholder(tf.int32, [None])

        def dnn(x):
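            # Bidirectional dynamic RNN built from forward and backward Grid2LSTM cells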
            cell_fw = grid_rnn.Grid2LSTMCell(num_units=128)
            cell_bw = grid_rnn.Grid2LSTMCell(num_units=128)
            bidirectional_grid_rnn = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, x, dtype=tf.float32)
            outputs = tf.reshape(bidirectional_grid_rnn[0], [-1, 256])

            W = tf.Variable(tf.truncated_normal([256, 80], stddev=0.1, dtype=tf.float32), name='W')
            b = tf.Variable(tf.constant(0., dtype=tf.float32, shape=[80]), name='b')

            logits = tf.matmul(outputs, W) + b
            logits = tf.reshape(logits, [tf.shape(x)[0], -1, 80])
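            # The CTC ops expect time-major logits: [max_time, batch_size, num_classes]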
            logits = tf.transpose(logits, (1, 0, 2))
            return logits

        net = dnn(X)
        decoded, _ = tf.nn.ctc_beam_search_decoder(net, seq_lens, merge_repeated=False)
        cost = tf.reduce_mean(tf.nn.ctc_loss(inputs=net, labels=Y, sequence_length=seq_lens))
        optimizer = tf.train.MomentumOptimizer(learning_rate=0.001, momentum=0.5)
        label_error_rate = tf.reduce_mean(tf.edit_distance(tf.cast(decoded[0], tf.int32), Y))

        train_op = tflearn.TrainOp(loss=cost, optimizer=optimizer, metric=label_error_rate, batch_size=1)
        trainer = tflearn.Trainer(train_ops=train_op, tensorboard_verbose=0)

        trainer.fit({X: x_train, Y: y_train, seq_lens: dataset_utils.get_seq_lens(x_train)},
                    val_feed_dicts={X: x_test, Y: y_test, seq_lens: dataset_utils.get_seq_lens(x_test)},
                    n_epoch=1,
                    show_metric=True)


if __name__ == '__main__':
    tf.app.run(main=main)

Unfortunately, when I run the code it throws a RecursionError, and I would like to know how to fix it. To reproduce it, just clone this repository and run train_using_tflearn_trainer.py.

1 Answer:

Answer 0 (score: 0)

I have solved the recursion error. Apparently the label_error_rate metric was causing it, so I removed that metric. Now I'm left with this error:

Tensorflow TypeError: only integer scalar arrays can be converted to a scalar index
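
For reference, the change described in this answer amounts to building the TrainOp without the metric argument. This is only a sketch of that workaround; it assumes the cost, optimizer, placeholders, and data variables are exactly the ones defined in the question's code above:

        # Workaround sketch: same graph as in the question, but the TrainOp is
        # built without the label_error_rate metric that triggered the RecursionError.
        train_op = tflearn.TrainOp(loss=cost, optimizer=optimizer, batch_size=1)
        trainer = tflearn.Trainer(train_ops=train_op, tensorboard_verbose=0)

        trainer.fit({X: x_train, Y: y_train, seq_lens: dataset_utils.get_seq_lens(x_train)},
                    val_feed_dicts={X: x_test, Y: y_test, seq_lens: dataset_utils.get_seq_lens(x_test)},
                    n_epoch=1,
                    show_metric=False)  # no metric left to display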