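After training my model, I save it and then try to restore it to compute the cost/accuracy on the dev set.
Before restoring, I run the following statements to see which variables the checkpoint contains: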
from tensorflow.python.tools import inspect_checkpoint as chkp
chkp.print_tensors_in_checkpoint_file("./trained_models/my_nn_model.ckpt", tensor_name='',
                                      all_tensors=True, all_tensor_names=True)
I see the following output:
tensor_name: biases/b1
[[ 0.4088161 ]
[ 0.73051345]
[ 0.861546 ]
[-0.01601586]]
tensor_name: biases/b1/Adam
[[ 0.06940479]
[-0.01317821]
[ 0.00601695]
[ 0.0169837 ]]
tensor_name: biases/b1/Adam_1
[[0.00422197]
[0.00048599]
[0.00077043]
[0.00035076]]
tensor_name: biases/b2
[[ 0.80142576]
[-0.09536028]
[ 0.31366938]]
tensor_name: biases/b2/Adam
[[ 0.08435135]
[ 0.03394406]
[-0.04104255]]
tensor_name: biases/b2/Adam_1
[[0.00650834]
[0.00206493]
[0.00083752]]
tensor_name: biases/b3
[[-0.6808493 ]
[ 0.42616928]]
tensor_name: biases/b3/Adam
[[ 0.11350942]
[-0.11350942]]
tensor_name: biases/b3/Adam_1
[[0.00629836]
[0.00629836]]
tensor_name: train/beta1_power
0.004638391
tensor_name: train/beta2_power
0.9502551
tensor_name: weights/W1
[[ 0.35077223 0.30753523 0.19711483 -0.5701605 0.22447775]
[-0.7757121 -0.20513503 0.4545326 -0.14088248 0.4854558 ]
[-0.66474247 0.28792825 0.06203659 -0.0888676 -0.74835175]
[-0.41984704 -0.5626613 -0.02844676 0.77327466 0.19199598]]
tensor_name: weights/W1/Adam
[[ 0.13355881 0.4353028 0.4103592 0.14981574 0.27531895]
[ 0.01698016 -0.07343768 -0.11361112 -0.04086655 -0.07324728]
[-0.00324349 0.02257502 0.04864099 0.02607765 0.0225742 ]
[ 0.11069385 0.09307133 0.06229053 0.07731174 0.08953418]]
tensor_name: weights/W1/Adam_1
[[0.06442691 0.11718791 0.16552295 0.10027011 0.11132942]
[0.00597157 0.01351114 0.01625086 0.0113084 0.01210043]
[0.0034455 0.0109939 0.04340019 0.02456977 0.01193165]
[0.010284 0.01212158 0.01438992 0.01114361 0.01298358]]
tensor_name: weights/W2
[[ 0.6157185 -0.02184171 0.5163279 -0.3498895 ]
[-0.15082173 0.21863511 -0.21755247 0.39887637]
[-0.5565993 0.65659076 -0.6370119 0.41734824]]
tensor_name: weights/W2/Adam
[[ 0.39385152 0.27537686 0.01230302 -0.05157183]
[ 0.08531421 0.15998691 0.00756624 0.01899205]
[-0.11271227 -0.18292099 -0.00443625 -0.0315922 ]]
tensor_name: weights/W2/Adam_1
[[0.11990622 0.17129508 0.00665622 0.0358038 ]
[0.03782089 0.06448739 0.00252486 0.01346588]
[0.00787948 0.01284081 0.00035877 0.00662182]]
tensor_name: weights/W3
[[ 0.5939301 0.605848 -0.59496546]
[-0.23180145 0.17120583 0.04733036]]
tensor_name: weights/W3/Adam
[[ 0.40406024 0.07094829 0.11723397]
[-0.40406027 -0.07094829 -0.11723398]]
tensor_name: weights/W3/Adam_1
[[0.11013244 0.03589008 0.01292834]
[0.11013244 0.03589008 0.01292834]]
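I expected to see biases/b1, weights/W1, and so on.
But I did not expect to see biases/b1/Adam, biases/b1/Adam_1, and so on.
The TensorFlow documentation states: "Estimators automatically save and restore variables (in the model_dir)." Since I use AdamOptimizer in my model, I assume that the extra variables I see above (biases/b1/Adam etc.) are related to this statement.
But this is confusing.
After training my model, which b1 variable is my final one? For example, is it biases/b1, biases/b1/Adam, or biases/b1/Adam_1?
These biases/b1/Adam... variables also do not seem to be welcome in my program: when I restore my model, I get a runtime error saying "cannot add op with name weights/W1/Adam as that name is already used". How do I fix this?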
Answer 0 (score: 0)
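Your final variable is 'biases/b1'. The other two are variables created by the Adam optimizer, which uses them during training to maintain running estimates of the first and second moments of the gradient (not derivatives of the gradient). These variables may still be useful to you if you want to continue training after restoring the model.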
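To make the role of the extra checkpoint entries concrete, here is a minimal pure-Python sketch of Adam updates for a single scalar parameter. It is not TensorFlow's implementation: the function name adam_step, the toy loss, and the hyperparameter defaults are assumptions chosen for illustration. In it, m plays the role of biases/b1/Adam, v plays the role of biases/b1/Adam_1, and the powers of beta1 and beta2 used for bias correction correspond to train/beta1_power and train/beta2_power.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (illustrative helper, not a TF API).

    Returns the updated (param, m, v). `t` is the 1-based step count.
    """
    # Running estimate of the first moment of the gradient (role of ".../Adam").
    m = beta1 * m + (1.0 - beta1) * grad
    # Running estimate of the second moment of the gradient (role of ".../Adam_1").
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction uses powers of beta1 and beta2.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Only `param` is the model's actual value; m and v are optimizer state.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy example: minimize b1**2, whose gradient is 2 * b1.
b1, m, v = 0.5, 0.0, 0.0
for t in range(1, 4):
    grad = 2.0 * b1
    b1, m, v = adam_step(b1, grad, m, v, t)
```

Only b1 is needed to evaluate the model afterwards; m and v matter only if training is resumed.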
This answer shows that you can save only the trainable variables with the following line of code:
saver=tf.train.Saver(var_list=tf.trainable_variables())
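If you introduce new variables that differ from those saved in the checkpoint, you need to initialize them manually, because the Saver will not do it for you.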
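The two points above (saving only trainable variables, then initializing everything else by hand) can be sketched end to end as follows. This is a sketch under stated assumptions, not the asker's model: the tiny one-layer graph, the build_graph helper, and the temporary checkpoint path are hypothetical, and it uses tensorflow.compat.v1 so that the TF 1.x graph-mode calls also run on a TensorFlow 2.x install.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # assumption: TF 1.x style graph mode, as in the question


def build_graph():
    """Hypothetical one-layer model with variables named like those in the question."""
    with tf.variable_scope("weights"):
        W1 = tf.get_variable("W1", shape=[4, 5])
    with tf.variable_scope("biases"):
        b1 = tf.get_variable("b1", shape=[4, 1], initializer=tf.zeros_initializer())
    x = tf.placeholder(tf.float32, [5, None], name="x")
    loss = tf.reduce_mean(tf.square(tf.matmul(W1, x) + b1))
    train_op = tf.train.AdamOptimizer().minimize(loss)  # creates the Adam variables
    return x, train_op


batch = [[1.0]] * 5  # one dummy example with 5 features
ckpt = os.path.join(tempfile.mkdtemp(), "my_nn_model.ckpt")  # hypothetical path

# Train for one step and save only the trainable variables.
tf.reset_default_graph()
x, train_op = build_graph()
saver = tf.train.Saver(var_list=tf.trainable_variables())
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op, feed_dict={x: batch})
    saver.save(sess, ckpt)

# The checkpoint now holds no Adam or beta*_power entries.
saved_names = sorted(name for name, _ in tf.train.list_variables(ckpt))

# Restore into a fresh graph; everything the Saver does not cover
# (here the optimizer's variables) has to be initialized manually.
tf.reset_default_graph()
x, train_op = build_graph()
trainable = tf.trainable_variables()
trainable_names = {var.name for var in trainable}
others = [var for var in tf.global_variables() if var.name not in trainable_names]
saver = tf.train.Saver(var_list=trainable)
with tf.Session() as sess:
    saver.restore(sess, ckpt)
    sess.run(tf.variables_initializer(others))
    sess.run(train_op, feed_dict={x: batch})  # training can continue
```

The trade-off is that the optimizer state is discarded, so training resumed from such a checkpoint starts with fresh moment estimates.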