I have a function that saves a SavedModel.
It is called periodically during training, every N epochs, to replace the current SavedModel with a newer one.
import shutil

import tensorflow as tf


def save_savedmodel(sess, snapshot_dir):
    # Look up the tensors stored in the graph collections.
    example = tf.get_collection("example")[0]
    label = tf.get_collection("label")[0]
    softmax = tf.get_collection("softmax")[0]
    logits = tf.get_collection("logits")[0]

    # Remove the previous export so simple_save can write to the same path.
    savedmodel_dir = snapshot_dir + "/saved_model"
    shutil.rmtree(savedmodel_dir, ignore_errors=True)
    tf.saved_model.simple_save(
        sess,
        savedmodel_dir,
        inputs={"example": example},
        outputs={"softmax": softmax, "logits": logits},
    )
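This script worked until I upgraded TensorFlow to 1.12.
Now the first call to this function works, but subsequent calls fail with the stack trace attached below, complaining that there are two variables named batch_normalization/beta/Adagrad:
ValueError: At least two variables have the same name: batch_normalization/beta/Adagrad
The first confusing thing is that this does not appear to be true. Here is what I get when I print the graph's node names at the point of calling simple_save():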
for name in [n.name for n in tf.get_default_graph().as_graph_def().node]:
    print(name)
batch_normalization/beta/Adagrad/Initializer/Const
batch_normalization/beta/Adagrad
batch_normalization/beta/Adagrad/Assign
batch_normalization/beta/Adagrad/read
batch_normalization/beta/Adagrad/Initializer/Const_1
batch_normalization/beta/Adagrad_1
batch_normalization/beta/Adagrad/Assign_1
batch_normalization/beta/Adagrad/read_1
There is a batch_normalization/beta/Adagrad and a batch_normalization/beta/Adagrad_1: is this normal or fishy?
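The node listing mixes variables with their initializer, assign, and read ops, so it does not directly show what the saver compares. Judging by the traceback, the ValueError is raised in OpListToDict, which works on variables rather than on raw graph nodes. A minimal sketch for checking variable names directly follows. It assumes the saver compares `var.op.name` values, and `duplicate_names` and `variable_op_names` are hypothetical helpers, not TensorFlow APIs.

```python
from collections import Counter


def duplicate_names(names):
    """Return the names that occur more than once, sorted."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def variable_op_names(variables):
    """Collect var.op.name for each variable-like object (hypothetical helper)."""
    return [v.op.name for v in variables]


# Intended use inside the training script (TensorFlow 1.x, not run here):
#   print(duplicate_names(variable_op_names(tf.global_variables())))

# Graph node names are unique within a graph, so the two names from the
# listing above are not duplicates of each other at the node level.
print(duplicate_names([
    "batch_normalization/beta/Adagrad",
    "batch_normalization/beta/Adagrad_1",
]))
```

If the commented call prints a non-empty list just before the failing save, two distinct variable objects would be resolving to the same op name, which the node listing alone cannot reveal.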
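What puzzles me is that I am only saving the model, not restoring it. So is my graph already corrupted before I call this function a second time, or does saving the model somehow modify the session?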
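One way to test whether the save call itself changes the graph is to compare the node names before and after it. The sketch below uses a hypothetical helper, `added_nodes`; the TensorFlow calls appear only in comments and are the same ones used in the node-listing loop above. The sample names are made up for illustration.

```python
def added_nodes(before, after):
    """Return names present in `after` but not in `before`, keeping order."""
    seen = set(before)
    return [name for name in after if name not in seen]


# Intended use around the save call (TensorFlow 1.x, not run here):
#   before = [n.name for n in tf.get_default_graph().as_graph_def().node]
#   save_savedmodel(sess, snapshot_dir)
#   after = [n.name for n in tf.get_default_graph().as_graph_def().node]
#   print(added_nodes(before, after))

# Made-up node names, only to show the helper's behaviour.
before = ["node_a", "node_b"]
after = ["node_a", "node_b", "node_c"]
print(added_nodes(before, after))
```

An empty result around the first save would suggest that saving leaves the graph alone; a non-empty one would point at ops being added on each call.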
Pass your op to the equivalent parameter main_op instead.
Traceback (most recent call last):
  File "src/tictactoe/train.py", line 412, in <module>
    train(sess)
  File "src/tictactoe/train.py", line 312, in train
    write_snapshot(sess, saver, global_step_val, "champion")
  File "src/tictactoe/train.py", line 185, in write_snapshot
    save_savedmodel(sess, snapshot_dir)
  File "src/tictactoe/train.py", line 209, in save_savedmodel
    outputs={"softmax": softmax, "logits": logits},
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/saved_model/simple_save.py", line 85, in simple_save
    clear_devices=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/saved_model/builder_impl.py", line 415, in add_meta_graph_and_variables
    saver = self._maybe_create_saver(saver)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/saved_model/builder_impl.py", line 272, in _maybe_create_saver
    allow_empty=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1102, in __init__
    self.build()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1114, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1151, in _build
    build_save=build_save, build_restore=build_restore)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 773, in _build_internal
    saveables = self._ValidateAndSliceInputs(names_to_saveables)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 673, in _ValidateAndSliceInputs
    names_to_saveables = BaseSaverBuilder.OpListToDict(names_to_saveables)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 572, in OpListToDict
    name)
ValueError: At least two variables have the same name: batch_normalization/beta/Adagrad