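I want to capture the loss value that is printed to the log at each step while training a tensorflow.estimator.DNNRegressor model.
Given this trivial, nonsensical example: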
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: import tensorflow as tf
In [4]: X = pd.DataFrame({'X1':np.arange(1,10), 'X2':np.arange(11,20)})
In [5]: X
Out[5]:
   X1  X2
0   1  11
1   2  12
2   3  13
3   4  14
4   5  15
5   6  16
6   7  17
7   8  18
8   9  19
In [6]: y = pd.Series(np.arange(1,10)/2)
In [7]: feature_cols = [tf.feature_column.numeric_column(col)
   ...:                 for col in X.columns]
In [8]: regressor = tf.estimator.DNNRegressor(feature_columns=feature_cols,
   ...:                                       hidden_units=[3,3],
   ...:                                       model_dir='mymodel')
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_tf_random_seed': 1,
'_save_checkpoints_secs': 600, '_log_step_count_steps': 100,
'_keep_checkpoint_max': 5, '_save_checkpoints_steps': None,
'_session_config': None, '_model_dir': 'mymodel',
'_keep_checkpoint_every_n_hours': 10000, '_save_summary_steps': 100}
In [9]: regressor.train(
   ...:     input_fn=tf.estimator.inputs.pandas_input_fn(
   ...:         x=X,
   ...:         y=y,
   ...:         num_epochs=None,
   ...:         shuffle=True),
   ...:     steps=1000)
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into mymodel/model.ckpt.
INFO:tensorflow:step = 1, loss = 1090.0
INFO:tensorflow:global_step/sec: 685.57
INFO:tensorflow:step = 101, loss = 580.524 (0.146 sec)
INFO:tensorflow:global_step/sec: 807.454
INFO:tensorflow:step = 201, loss = 423.964 (0.124 sec)
INFO:tensorflow:global_step/sec: 875.857
INFO:tensorflow:step = 301, loss = 353.421 (0.114 sec)
INFO:tensorflow:global_step/sec: 788.649
INFO:tensorflow:step = 401, loss = 297.249 (0.127 sec)
INFO:tensorflow:global_step/sec: 649.258
INFO:tensorflow:step = 501, loss = 254.237 (0.154 sec)
INFO:tensorflow:global_step/sec: 803.059
INFO:tensorflow:step = 601, loss = 303.544 (0.125 sec)
INFO:tensorflow:global_step/sec: 674.359
INFO:tensorflow:step = 701, loss = 234.27 (0.148 sec)
INFO:tensorflow:global_step/sec: 818.35
INFO:tensorflow:step = 801, loss = 259.353 (0.122 sec)
INFO:tensorflow:global_step/sec: 672.83
INFO:tensorflow:step = 901, loss = 208.319 (0.149 sec)
INFO:tensorflow:Saving checkpoints for 1000 into mymodel/model.ckpt.
INFO:tensorflow:Loss for final step: 200.45.
Out[9]: <tensorflow.python.estimator.canned.dnn.DNNRegressor at 0x1076d5470>
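So, from the output above, I would like to capture step = N and loss = value so that I can plot them and analyze them further.
Any help here is appreciated.
Answer 0 (score: 0)
OK, so I did not find a way to retrieve the training loss values. However, I did find a way to capture and plot the loss values returned by the evaluate()
function, which gives me pretty much the information I need to determine whether I am overfitting my data.
Staying with the trivial example above, it goes like this.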
In [1]: import tensorflow as tf
In [2]: from sklearn.model_selection import train_test_split
In [3]: import pandas as pd
In [4]: import numpy as np
In [5]: X = pd.DataFrame({'X1':np.arange(1,10), 'X2':np.arange(11,20)})
In [6]: y = pd.Series(np.arange(1,10)/2)
In [7]: X_train, X_validate, y_train, y_validate = train_test_split(X, y, test_size=0.2, random_state=23)
In [8]: feature_cols = [tf.feature_column.numeric_column(col)
   ...:                 for col in X.columns]
   ...:
In [9]: regressor = tf.estimator.DNNRegressor(feature_columns=feature_cols,
   ...:                                       hidden_units=[3,3],
   ...:                                       model_dir='fake_model')
In [10]: validation_losses = []
In [11]: for _ in range(10):
    ...:     regressor.train(
    ...:         input_fn=tf.estimator.inputs.pandas_input_fn(
    ...:             x=X_train,
    ...:             y=y_train,
    ...:             num_epochs=None,
    ...:             shuffle=True),
    ...:         steps=10)
    ...:     validation_losses.append(regressor.evaluate(
    ...:         input_fn=tf.estimator.inputs.pandas_input_fn(
    ...:             x=X_validate,
    ...:             y=y_validate,
    ...:             num_epochs=1,
    ...:             shuffle=False)))
In [12]: import matplotlib.pyplot as plt
In [13]: losses = [l['loss'] for l in validation_losses]
In [14]: steps = list(range(10))  # index of each train/evaluate round (10 training steps per round)
In [15]: plt.scatter(x=steps, y=losses)
Out[15]: <matplotlib.collections.PathCollection at 0x1114e7048>
In [16]: plt.show()
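
For the training loss itself, one thing that might be worth trying is to read the values straight out of the log messages shown in the question. The following is only a minimal sketch, not a confirmed TensorFlow feature: it assumes that TensorFlow 1.x emits those messages through the standard Python logger named "tensorflow", and LossCaptureHandler is a hypothetical helper written for this illustration. The three log messages at the end are stand-ins copied from the question's output so the snippet runs without TensorFlow; in practice the regressor.train(...) call would go in their place.

```python
import logging
import re

# Matches messages such as "step = 101, loss = 580.524 (0.146 sec)".
STEP_LOSS_RE = re.compile(r"step = (\d+), loss = (\S+)")


class LossCaptureHandler(logging.Handler):
    """Hypothetical helper: collects (step, loss) pairs from log records."""

    def __init__(self):
        super().__init__(level=logging.INFO)
        self.records = []

    def emit(self, record):
        match = STEP_LOSS_RE.search(record.getMessage())
        if match:
            self.records.append((int(match.group(1)), float(match.group(2))))


# Assumption: TensorFlow 1.x logs through the standard logger "tensorflow".
tf_logger = logging.getLogger("tensorflow")
tf_logger.setLevel(logging.INFO)
handler = LossCaptureHandler()
tf_logger.addHandler(handler)

# regressor.train(...) would be called here. The lines below only replay
# messages copied from the question's log so the sketch is self-contained.
tf_logger.info("step = 1, loss = 1090.0")
tf_logger.info("global_step/sec: 685.57")
tf_logger.info("step = 101, loss = 580.524 (0.146 sec)")

tf_logger.removeHandler(handler)

# Split the captured pairs into two sequences ready for plotting.
steps, losses = zip(*handler.records)
```

The resulting steps and losses could then be passed to plt.scatter in the same way as the validation losses. Note that this only sees what the log contains, so in the run shown in the question it would yield one point every 100 steps rather than every step.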