TensorFlow's Estimator freezes with low CPU usage

Time: 2017-02-08 22:37:39

Tags: tensorflow

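I updated my TF to v1.0rc1, and Estimator.evaluate no longer works: it freezes at "Restoring model...". I tried to reproduce the problem, and the sample code below makes TF freeze with 220% CPU usage (2 CPUs) and no output at all. Any idea why this happens? Thanks!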

import tensorflow as tf
from tensorflow.contrib.layers.python.layers.optimizers import optimize_loss
from tensorflow.contrib.learn.python.learn.estimators import model_fn
from tensorflow.contrib.learn.python.learn.estimators.estimator import Estimator
from tensorflow.python.framework import ops


def main(_):
    def func(features, targets, mode, params):
        idx = tf.concat([features['a'], features['b']], axis=1)

        embedding = tf.get_variable("embed", [10, 20], dtype=tf.float32)

        pred = tf.reduce_sum(tf.nn.embedding_lookup(embedding, idx))

        train_op = optimize_loss(loss=pred,
                                 global_step=tf.train.get_global_step(),
                                 learning_rate=0.001,
                                 optimizer='Adam',
                                 variables=tf.trainable_variables(),
                                 name="training_loss_optimizer")

        eval_metric_dict = dict()
        eval_metric_dict['metric'] = pred

        return model_fn.ModelFnOps(mode=mode,
                                   predictions=pred,
                                   loss=pred,
                                   train_op=train_op,
                                   eval_metric_ops=eval_metric_dict)

    model = Estimator(func, params={})

    model.fit(
        input_fn=lambda: (
            {'a': ops.convert_to_tensor([[1, 2, 3, 4, 5]]), 'b': ops.convert_to_tensor([[2, 3, 4, 3, 5]])},
            None), steps=1)
    model.evaluate(
        input_fn=lambda: (
            {'a': ops.convert_to_tensor([[1, 2, 3, 4, 5]]), 'b': ops.convert_to_tensor([[2, 3, 4, 3, 5]])},
            None))


if __name__ == "__main__":
    tf.app.run()

1 answer:

Answer 0: (score: 1)

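By default, Estimator.evaluate assumes queue-based input and keeps evaluating until the input pipeline is exhausted. When the input is not queue-based, that means it loops forever. The fix is simple: just pass the steps argument to evaluate.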
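To make the looping behavior concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not TensorFlow's actual implementation: the `evaluate` and `constant_input` functions below are hypothetical stand-ins, assuming only what the answer states (evaluation runs until the input is exhausted unless a step limit is given).

```python
import itertools


def constant_input():
    """Hypothetical stand-in for an input_fn built from constant tensors.

    It can always produce another batch, so it never signals end-of-input.
    """
    while True:
        yield {'a': [1, 2, 3, 4, 5], 'b': [2, 3, 4, 3, 5]}


def evaluate(batches, steps=None):
    """Toy evaluation loop (not TensorFlow code).

    With steps=None it consumes `batches` until they run out.
    With an integer it stops after at most that many batches.
    """
    if steps is not None:
        batches = itertools.islice(batches, steps)
    count = 0
    for _ in batches:
        count += 1  # a real loop would update the metrics here
    return count


# A finite input terminates on its own.
print(evaluate(iter([{'a': [1]}, {'a': [2]}])))  # 2

# A constant input terminates only because steps is given;
# evaluate(constant_input()) without steps would never return.
print(evaluate(constant_input(), steps=1))  # 1
```

In the question's code, the corresponding change would be to add steps=1 (or any other finite number) to the model.evaluate(...) call, just as the model.fit(...) call already does.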