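I am trying to run the GNMT TF code on a bare-metal system, and I have set up the CUDA stack and tensorflow-gpu v1.15. TensorFlow made some API changes between 1.14 and 1.15, but after working through those I was able to run the code for training and evaluation.
However, when I compared my logs with those from the NGC container, I found that this bare-metal run was not using AMP (automatic mixed precision). I went through Nvidia's documentation and found the way to enable it for training here.
I added the following line before here: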
opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)
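However, I do not see automatic mixed precision being used for evaluation, because the optimizer is only invoked during backprop. So I tried to change the eval function by adding the mixed-precision rewrite to eval_fn() through the graph config in estimator.py: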
def eval_fn(hparams, ckpt=None, only_translate=False):
    model_fn = make_model_fn(hparams)
    sess_config = tf.ConfigProto(allow_soft_placement=True)
    sess_config.graph_options.rewrite_options.auto_mixed_precision = 1
    config = tf.estimator.RunConfig(
        log_step_count_steps=hparams.log_step_count_steps,
        session_config=sess_config)
    pred_estimator = tf.estimator.Estimator(
        model_fn=model_fn, model_dir=hparams.output_dir, config=config)
    return get_metrics(hparams, model_fn, pred_estimator, ckpt, only_translate=only_translate)
and commented out this call.
However, this leads to an error at run time:
Colocation members, user-requested devices, and framework assigned devices, if any:
tower_0/v0/index_to_string/hash_table (HashTableV2) /device:GPU:0
tower_0/v0/index_to_string/table_init/InitializeTableFromTextFileV2 (InitializeTableFromTextFileV2) /device:GPU:0
tower_0/v0/hash_table_Lookup/LookupTableFindV2 (LookupTableFindV2) /device:GPU:0
2019-11-07 07:51:24.124179: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-11-07 07:51:24.124776: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.803817: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-11-07 07:51:24.804442: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
I1107 07:51:24.825255 140735364352992 session_manager.py:500] Running local_init_op.
2019-11-07 07:51:24.846707: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-11-07 07:51:24.846978: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.870466: I tensorflow/core/kernels/lookup_util.cc:376] Table trying to initialize from file results/vocab.bpe.32000.en is already initialized.
I1107 07:51:24.872127 140735364352992 session_manager.py:502] Done running local_init_op.
2019-11-07 07:51:24.902816: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-11-07 07:51:24.903393: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.950724: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-11-07 07:51:24.951080: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.958353: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.960220: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.961727: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.963636: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.965878: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:24.967928: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-11-07 07:51:25.309130: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-11-07 07:51:25.319260: W tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1775] auto_mixed_precision graph optimizer FAILED: Failed precondition: Expected exactly 1 output from port tower_0/v0/dynamic_seq2seq/decoder/decoder/while/NextIteration_22:0, got 2
2019-11-07 07:51:25.319653: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] auto_mixed_precision failed: Failed precondition: Expected exactly 1 output from port tower_0/v0/dynamic_seq2seq/decoder/decoder/while/NextIteration_22:0, got 2
2019-11-07 07:51:25.497377: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
I1107 07:53:57.598690 140735364352992 estimator.py:748] Writing to file results/newstest2014_out_4000.tok.de
W1107 07:53:57.614538 140735364352992 deprecation_wrapper.py:119] From /home/mayroy13/Mayank/Mayank/test/nvidia_tf_examples/gnmt_v2/estimator.py:758: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
W1107 07:53:57.615267 140735364352992 deprecation_wrapper.py:119] From /home/mayroy13/Mayank/Mayank/test/nvidia_tf_examples/gnmt_v2/estimator.py:685: The name tf.gfile.Remove is deprecated. Please use tf.io.gfile.remove instead.
W1107 07:53:57.615499 140735364352992 deprecation_wrapper.py:119] From /home/mayroy13/Mayank/Mayank/test/nvidia_tf_examples/gnmt_v2/estimator.py:686: The name tf.gfile.Copy is deprecated. Please use tf.io.gfile.copy instead.
Warning: No built-in rules for language de.
Detokenizer Version $Revision: 4134 $
Language: de
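To make it easier to compare this run against the NGC container logs, the grappler messages in the log above can be tallied with a small script. This is only a minimal sketch: `summarize_amp_log` and `AMP_PATTERNS` are hypothetical names, and the patterns are copied from the messages in this particular log, so other TensorFlow versions may word them differently.

```python
import re
from collections import Counter

# Patterns copied from the log lines quoted above (assumption: other
# TensorFlow versions may phrase these messages differently).
AMP_PATTERNS = {
    "ran": re.compile(r"Running auto_mixed_precision graph optimizer"),
    "no_whitelist_ops": re.compile(r"No whitelist ops found, nothing to do"),
    "failed": re.compile(
        r"auto_mixed_precision graph optimizer FAILED: (?P<reason>.*)"),
}


def summarize_amp_log(lines):
    """Count auto_mixed_precision outcomes and collect failure reasons."""
    counts = Counter()
    failures = []
    for line in lines:
        # Skip everything that does not come from the AMP graph pass.
        if "auto_mixed_precision" not in line:
            continue
        for name, pattern in AMP_PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "failed":
                    failures.append(match.group("reason").strip())
    return counts, failures


if __name__ == "__main__":
    # Three lines taken from the log above, used here as sample input.
    sample = [
        "2019-11-07 07:51:24.124179: I tensorflow/core/grappler/optimizers/"
        "auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer",
        "2019-11-07 07:51:24.124776: I tensorflow/core/grappler/optimizers/"
        "auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do",
        "2019-11-07 07:51:25.319260: W tensorflow/core/grappler/optimizers/"
        "auto_mixed_precision.cc:1775] auto_mixed_precision graph optimizer "
        "FAILED: Failed precondition: Expected exactly 1 output from port "
        "tower_0/v0/dynamic_seq2seq/decoder/decoder/while/NextIteration_22:0, got 2",
    ]
    counts, failures = summarize_amp_log(sample)
    print(dict(counts))
    print(failures)
```

In the log above, every pass that runs either finds no whitelist ops or fails on the decoder while loop, which matches what I see: the rewrite never actually converts anything during evaluation.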
Any leads on enabling automatic mixed precision for evaluation would be helpful. Thanks :)
I have also opened an issue in the Nvidia repo here.