How to restore a Cudnn LSTM

Time: 2018-12-17 14:42:52

Tags: tensorflow lstm cudnn

I am trying to restore a network that was trained with CudnnLSTM. I am using TensorFlow 1.12, CUDA 9.0 and cuDNN 7.4. I trained and saved my network with the following code:

with graph.as_default():
    inputs_ = tf.placeholder(tf.float32, [None, seq_len, n_channels], name='inputs')
    labels_ = tf.placeholder(tf.float32, [None, n_classes], name='labels')
    keep_prob_ = tf.placeholder(tf.float32, name='keep')
    learning_rate_ = tf.placeholder(tf.float32, name='learning_rate')

    lstm_in = tf.transpose(inputs_, (1, 0, 2))
    cudnn_lstm = tf.contrib.cudnn_rnn.CudnnLSTM(num_layers=2, num_units=36, dtype=tf.float32)
    outputs_, _ = cudnn_lstm(lstm_in)

with graph.as_default():
    saver = tf.train.Saver()

with tf.Session(graph=graph, config=tf.ConfigProto(log_device_placement=True)) as sess:
    sess.run(tf.global_variables_initializer())
    iteration = 1

    for e in range(epochs):
        for x, y in get_batches(X_tr, y_tr, batch_size):

            feed = {inputs_: x, labels_: y, keep_prob_: 0.5, learning_rate_: learning_rate}

            loss, _,  acc = sess.run([cost, optimizer, accuracy], feed_dict=feed)
    # Save once training has finished, while the session is still open.
    saver.save(sess, "checkpoints/lstm.ckpt")

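This saves the model to a checkpoint file. Afterwards, using the code shown below, I tried to restore tensors such as the input placeholders and the accuracy.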

graph = tf.Graph()
with tf.Graph().as_default(), tf.device('/gpu:0'):
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph('checkpoints/lstm.ckpt.meta')
        saver.restore(sess, tf.train.latest_checkpoint('checkpoints/.'))  

        input = graph.get_tensor_by_name('inputs:0')
        output = graph.get_tensor_by_name('labels:0')
        pred = graph.get_tensor_by_name('pred_y:0')
        accuracy = graph.get_tensor_by_name('accuracy:0')
        keep_ = graph.get_tensor_by_name('keep:0')
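
Separately from the reported error, the restore snippet above creates `graph = tf.Graph()` but then imports the meta graph into a second, anonymous `tf.Graph()`, so the `graph.get_tensor_by_name(...)` calls would search an empty graph. A minimal sketch of keeping everything in one graph follows, assuming the TensorFlow 1.x API; the helper name `restore_tensors` and the constant `TENSOR_NAMES` are hypothetical. This does not by itself resolve the `KeyError` raised by `import_meta_graph`.

```python
# Tensor names looked up in the question's restore code.
TENSOR_NAMES = ["inputs:0", "labels:0", "pred_y:0", "accuracy:0", "keep:0"]


def restore_tensors(meta_path, ckpt_dir, tensor_names=TENSOR_NAMES):
    """Import a meta graph into one explicit graph and look tensors up in it.

    Hypothetical helper; assumes the TensorFlow 1.x API.
    """
    # Imported lazily so this module can be loaded on machines without TF 1.x.
    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():
        # The import and the later lookups now share the same graph object.
        saver = tf.train.import_meta_graph(meta_path)

    sess = tf.Session(graph=graph)
    saver.restore(sess, tf.train.latest_checkpoint(ckpt_dir))

    tensors = {name: graph.get_tensor_by_name(name) for name in tensor_names}
    return sess, tensors  # the caller is responsible for sess.close()
```

It would be called as `restore_tensors('checkpoints/lstm.ckpt.meta', 'checkpoints')`.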

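However, I get this error: "The name 'cudnn_lstm/opaque_kernel_saveable' refers to an Operation not in the graph." I know nothing about the opaque parameters in cudnn_rnn.py.

Is there a solution to this problem?

Error output: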

2018-12-24 09:09:32.042319: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-12-24 09:09:32.129542: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-24 09:09:32.130121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
totalMemory: 5.93GiB freeMemory: 4.84GiB
2018-12-24 09:09:32.130135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-12-24 09:09:32.332781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-24 09:09:32.332806: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2018-12-24 09:09:32.332813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2018-12-24 09:09:32.332964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4595 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "restore.py", line 60, in <module>
    saver = tf.train.import_meta_graph('checkpoints/lstm.ckpt.meta')
  File "/home/ks/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1674, in import_meta_graph
    meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
  File "/home/ks/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1696, in _import_meta_graph_with_return_elements
    **kwargs))
  File "/home/ks/.local/lib/python3.6/site-packages/tensorflow/python/framework/meta_graph.py", line 852, in import_scoped_meta_graph_with_return_elements
    ops.prepend_name_scope(value, scope_to_prepend_to_names))
  File "/home/ks/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3490, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "/home/ks/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3550, in _as_graph_element_locked
    "graph." % repr(name))
KeyError: "The name 'cudnn_lstm/opaque_kernel_saveable' refers to an Operation not in the graph."
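
The traceback fails inside `import_meta_graph` while it resolves a collection entry by name, which suggests that `cudnn_lstm/opaque_kernel_saveable` is a Python-side saveable object recorded in the meta graph rather than an actual graph operation. One workaround that may help, untested against this exact setup, is to skip `import_meta_graph`, rebuild the model with the same Python code, and restore the weights with a fresh `tf.train.Saver`. The sketch below assumes TensorFlow 1.x with `tf.contrib` and a CUDA-capable GPU; the function name `build_and_restore` and the constants are hypothetical, and the layers after the LSTM are omitted because the question does not show them.

```python
# Hyperparameters copied from the training code in the question.
NUM_LAYERS = 2
NUM_UNITS = 36


def build_and_restore(seq_len, n_channels, ckpt_dir="checkpoints"):
    """Rebuild the CudnnLSTM part of the model in code and restore its weights.

    Hypothetical helper; assumes TensorFlow 1.x with tf.contrib and a GPU.
    """
    # Imported lazily so this module can be loaded on machines without TF 1.x.
    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():
        inputs_ = tf.placeholder(
            tf.float32, [None, seq_len, n_channels], name="inputs")
        # CudnnLSTM expects time-major input: [time, batch, channels].
        lstm_in = tf.transpose(inputs_, (1, 0, 2))
        cudnn_lstm = tf.contrib.cudnn_rnn.CudnnLSTM(
            num_layers=NUM_LAYERS, num_units=NUM_UNITS, dtype=tf.float32)
        outputs_, _ = cudnn_lstm(lstm_in)
        # The layers that follow the LSTM are not shown in the question;
        # rebuild them here, in the same order, before creating the Saver.
        saver = tf.train.Saver()

    sess = tf.Session(graph=graph)
    # Only variables present in the rebuilt graph are read from the checkpoint.
    saver.restore(sess, tf.train.latest_checkpoint(ckpt_dir))
    return sess, inputs_, outputs_  # the caller is responsible for sess.close()
```

Because the layer is constructed in Python again, its saveable object is recreated along with the graph, so nothing has to be looked up by name from the `.meta` file.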

0 answers:

No answers