The TensorFlow tutorial at https://www.tensorflow.org/get_started/get_started has an Estimator example that builds a linear regression model, as follows:
import tensorflow as tf
import numpy as np
feature_columns = [tf.feature_column.numeric_column("x", shape=[1])]
estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)
x_train = np.array([1., 2., 3., 4.])
y_train = np.array([0, -1., -2., -3.])
x_eval = np.array([2., 5., 8., 1.])
y_eval = np.array([-1.01, -4.1, -7, 0.])
input_fn = tf.estimator.inputs.numpy_input_fn({"x":x_train}, y_train, batch_size = 4, num_epochs=None, shuffle=True)
train_input_fn = tf.estimator.inputs.numpy_input_fn({"x":x_train}, y_train, batch_size = 4, num_epochs=1000, shuffle=False)
eval_input_fn = tf.estimator.inputs.numpy_input_fn({"x":x_eval}, y_eval, batch_size = 4, num_epochs=1000, shuffle = False)
estimator.train(input_fn=input_fn, steps=1000)
train_metrics = estimator.evaluate(input_fn = train_input_fn)
eval_metrics = estimator.evaluate(input_fn = eval_input_fn)
print("train metrics: %r"% train_metrics)
print("eval metrics: %r"% eval_metrics)
My question is about 'train_input_fn' and 'eval_input_fn': why do we need to choose 'num_epochs = 1000'?
Here are the outputs for different values of 'num_epochs':
num_epochs = 1000
train metrics: {'global_step': 1000, 'loss': 4.3708383e-08, 'average_loss': 1.0927096e-08}
eval metrics: {'global_step': 1000, 'loss': 0.010135064, 'average_loss': 0.002533766}
num_epochs = 1
train metrics: {'global_step': 1000, 'loss': 9.6500253e-07, 'average_loss': 2.4125063e-07}
eval metrics: {'global_step': 1000, 'loss': 0.010293347, 'average_loss': 0.0025733367}
With num_epochs = 1, I was expecting the values of 'loss' and 'average_loss' to be the same. Can someone help me understand this?
Thanks.
Answer 0 (score: 0)
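Your intuition is on the right track; the two values differ only by a factor of the batch size:
average_loss == loss / batch_size
or
0.002533766 == 0.010135064 / 4
or
0.002533766 == 0.010135064 / x_eval.shape[0]
However, I also struggle to see why evaluating over multiple epochs makes sense, especially if x_eval.shape[0] == batch_size. Maybe they were just trying to show that the function tf.estimator.inputs.numpy_input_fn accepts num_epochs as a parameter.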
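To make the relationship concrete, here is a minimal NumPy sketch (not the Estimator's actual implementation) that mimics the two metrics under stated assumptions: 'average_loss' is taken to be the mean squared error per example, and 'loss' the per-batch sum of squared errors averaged over batches. The helper name eval_metrics and the fitted line y = -x + 1 are hypothetical; the line is simply the one the training data lies on, not the weights the estimator actually learned.

```python
import numpy as np

def eval_metrics(y_true, y_pred, batch_size, num_epochs):
    """Return (loss, average_loss) for data repeated num_epochs times."""
    # Repeat the data num_epochs times, as the input function would.
    y_true = np.tile(y_true, num_epochs)
    y_pred = np.tile(y_pred, num_epochs)
    sq_err = (y_true - y_pred) ** 2
    # average_loss (assumed): mean squared error over every example seen.
    average_loss = float(sq_err.mean())
    # loss (assumed): squared errors summed within each batch,
    # then averaged over all batches.
    batch_sums = [sq_err[i:i + batch_size].sum()
                  for i in range(0, len(sq_err), batch_size)]
    loss = float(np.mean(batch_sums))
    return loss, average_loss

x_eval = np.array([2., 5., 8., 1.])
y_eval = np.array([-1.01, -4.1, -7., 0.])
# Hypothetical model: the line y = -x + 1 that the training data lies on.
y_pred = -1.0 * x_eval + 1.0

for epochs in (1, 1000):
    loss, average_loss = eval_metrics(y_eval, y_pred, batch_size=4,
                                      num_epochs=epochs)
    print("num_epochs=%d loss=%r average_loss=%r" % (epochs, loss, average_loss))
```

Under these assumptions, 'loss' is batch_size times 'average_loss', and neither value depends on num_epochs as long as the model is fixed and the data divides evenly into batches. If that holds for the Estimator as well, the small differences between the two runs in the question would more plausibly come from separate training runs than from the epoch count.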