I am experimenting with an LSTM model in a Colab TPU environment, but I cannot save the trained model.
sess = tf.keras.backend.get_session()
training_model = lstm_model(seq_len=100, batch_size=128, stateful=False)
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    training_model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)))

tpu_model.fit_generator(
    training_generator(seq_len=100, batch_size=1024),
    steps_per_epoch=100,
    epochs=4)

export_path = '/content/output/'
tf.saved_model.simple_save(
    sess,
    export_path,
    inputs={'input_image': tpu_model.input},
    outputs={t.name: t for t in tpu_model.outputs})
Here is the exception:
FailedPreconditionError Traceback (most recent call last)
<ipython-input-13-020e67d3772b> in <module>()
29 export_path,
30 inputs={'input_image': tpu_model.input},
---> 31 outputs={t.name: t for t in tpu_model.outputs})
...skipped....
FailedPreconditionError: Error while reading resource variable. This could mean that the variable was uninitialized. Not found: Resource worker/TFOptimizer/iterations/N10tensorflow3VarE does not exist.
[[{{node ReadVariables_8976001795006639924/_2}} = _ReadVariablesOp[N=40, dtypes=[DT_INT64, DT_INT64, DT_INT64, DT_INT64, DT_INT64, ..., DT_FLOAT, DT_INT64, DT_INT64, DT_INT64, DT_INT64], _device="/job:worker/replica:0/task:0/device:CPU:0"](VarHandles_14315951673884632260/_0, VarHandles_14315951673884632260/_0:1, VarHandles_14315951673884632260/_0:2, VarHandles_14315951673884632260/_0:3, VarHandles_14315951673884632260/_0:4, VarHandles_14315951673884632260/_0:5, VarHandles_14315951673884632260/_0:6, VarHandles_14315951673884632260/_0:7, VarHandles_14315951673884632260/_0:8, VarHandles_14315951673884632260/_0:9, VarHandles_14315951673884632260/_0:10, VarHandles_14315951673884632260/_0:11, VarHandles_14315951673884632260/_0:12, VarHandles_14315951673884632260/_0:13, VarHandles_14315951673884632260/_0:14, VarHandles_14315951673884632260/_0:15, VarHandles_14315951673884632260/_0:16, VarHandles_14315951673884632260/_0:17, VarHandles_14315951673884632260/_0:18, VarHandles_14315951673884632260/_0:19, VarHandles_14315951673884632260/_0:20, VarHandles_14315951673884632260/_0:21, VarHandles_14315951673884632260/_0:22, VarHandles_14315951673884632260/_0:23, VarHandles_14315951673884632260/_0:24, VarHandles_14315951673884632260/_0:25, VarHandles_14315951673884632260/_0:26, VarHandles_14315951673884632260/_0:27, VarHandles_14315951673884632260/_0:28, VarHandles_14315951673884632260/_0:29, VarHandles_14315951673884632260/_0:30, VarHandles_14315951673884632260/_0:31, VarHandles_14315951...
[[{{node ReadVariables_16894311020792346126/_3_G1412}} = _Send[T=DT_FLOAT, client_terminated=false, recv_device="/job:worker/replica:0/task:0/device:CPU:0", send_device="/job:worker/replica:0/task:0/device:TPU:0", send_device_incarnation=8311516724619575166, tensor_name="edge_133_ReadVariables_16894311020792346126/_3", _device="/job:worker/replica:0/task:0/device:TPU:0"](ReadVariables_16894311020792346126/_3:8)]]
Please advise.
Answer (score 0):
This should work if you replace the tf.saved_model.simple_save() call with, for example,

tpu_model.save_weights(os.path.join(export_path, 'weights.h5'), overwrite=True)

(This and other examples are linked from the bottom of https://colab.research.google.com/notebooks/tpu.ipynb.)
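The weights-only round trip the answer proposes can be sketched without any TPU hardware. This is a minimal, CPU-only illustration with plain tf.keras (the question's lstm_model, TPU_WORKER, and the tf.contrib TPU API are not used here; build_model is a hypothetical stand-in for the real architecture): save the trained weights to an HDF5 file, then restore them into a freshly built copy of the same model.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

def build_model(seq_len=10):
    # Hypothetical stand-in for the question's lstm_model():
    # a tiny char-level LSTM classifier.
    inputs = tf.keras.Input(shape=(seq_len,), dtype='int32')
    h = tf.keras.layers.Embedding(256, 8)(inputs)
    h = tf.keras.layers.LSTM(8)(h)
    outputs = tf.keras.layers.Dense(256, activation='softmax')(h)
    return tf.keras.Model(inputs, outputs)

export_path = tempfile.mkdtemp()
model = build_model()

# Weights-only save, as the answer suggests. The '.weights.h5' suffix is
# required by newer Keras releases and accepted by older ones.
weights_file = os.path.join(export_path, 'ckpt.weights.h5')
model.save_weights(weights_file, overwrite=True)

# Rebuild the identical architecture (on CPU) and load the weights into it.
cpu_model = build_model()
cpu_model.load_weights(weights_file)

# Both copies should now produce the same predictions.
x = np.random.randint(0, 256, size=(2, 10))
np.testing.assert_allclose(model.predict(x, verbose=0),
                           cpu_model.predict(x, verbose=0), atol=1e-5)
```

The key point is that save_weights only touches the variable values, so it avoids exporting the TPU-rewritten graph that tf.saved_model.simple_save trips over; the architecture must be reconstructed in code before load_weights is called.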