TensorFlow: using a model trained with CudnnLSTM on the CPU

Asked: 2018-05-02 05:09:46

Tags: python tensorflow cudnn

I trained a model in TensorFlow on a GPU using CudnnLSTM. When I try to run inference with the model on a CPU, I get this error:

Invalid argument: No OpKernel was registered to support Op 'CudnnRNN' with these attrs.  Registered devices: [CPU], Registered kernels:
  <no registered kernels>

     [[Node: cudnn_lstm/CudnnRNN = CudnnRNN[T=DT_FLOAT, direction="bidirectional", dropout=0, input_mode="linear_input", is_training=false, rnn_mode="lstm", seed=87654321, seed2=4567](Reshape_1, cudnn_lstm/zeros, cudnn_lstm/zeros_1, cudnn_lstm/opaque_kernel/read)]]

So, how can we use this model on the CPU?

3 answers:

Answer 0 (score: 3)

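Please check the comments in the TensorFlow source code for the CuDNN LSTM layer at:

https://github.com/tensorflow/tensorflow/blob/r1.6/tensorflow/contrib/cudnn_rnn/python/layers/cudnn_rnn.py

Starting at line 83, they describe how to do what you want. Basically, after training with the CuDNN layer, you need to transfer the weights into a model built with CuDNN-compatible LSTM cells (tf.contrib.cudnn_rnn.CudnnCompatibleLSTMCell). Such a model runs on both CPU and GPU. Also, as far as I know, the CuDNN LSTM layer in TensorFlow is time-major, so don't forget to transpose your input (I am not sure whether this still holds in the latest TensorFlow version, so please verify it).

For a short, complete example based on the above, see melgor's gist: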

https://gist.github.com/melgor/41e7d9367410b71dfddc33db34cba85f?short_path=29ebfc6
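
To illustrate the time-major remark above, here is a minimal sketch in plain Python of what the transposition does. It assumes the input is a nested list shaped [batch][time][features] with equal-length sequences; the function name to_time_major is made up for this example. On real tensors the same reordering would be a transpose with axes (1, 0, 2), for instance tf.transpose(x, perm=[1, 0, 2]).

```python
def to_time_major(batch_major):
    """Reorder [batch][time][features] nested lists to [time][batch][features].

    Assumes every sequence in the batch has the same number of time steps.
    """
    # zip(*batch_major) groups together the t-th step of every sequence.
    return [list(step) for step in zip(*batch_major)]


# Two sequences, three time steps, two features per step.
batch = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
time_major = to_time_major(batch)
# time_major[t][b] now holds what batch[b][t] held before.
```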

Answer 1 (score: 2)

Reason: tensorflow doesn't see your GPU

Fix: install the CUDA Toolkit and cuDNN SDK (compatible with your tf version), then run: 'pip uninstall tensorflow'; 'pip install tensorflow-gpu'

Summary:
    1. check if tensorflow sees your GPU (optional)
    2. check if your videocard can work with tensorflow (optional)
    3. find versions of CUDA Toolkit and cuDNN SDK, compatible with your tf version
        (https://www.tensorflow.org/install/source#linux)
    4. install CUDA Toolkit
        (https://developer.nvidia.com/cuda-toolkit-archive)
    5. install cuDNN SDK 
        (https://developer.nvidia.com/rdp/cudnn-archive)
    6. pip uninstall tensorflow; pip install tensorflow-gpu 
    7. check if tensorflow sees your GPU
    * source - https://www.tensorflow.org/install/gpu


Detailed instruction:
    1. check if tensorflow sees your GPU (optional)
        from tensorflow.python.client import device_lib
        def get_available_devices():
            local_device_protos = device_lib.list_local_devices()
            return [x.name for x in local_device_protos]
        print(get_available_devices()) 
        # my output was => ['/device:CPU:0']
        # good output must be => ['/device:CPU:0', '/device:GPU:0']
    2. check if your card can work with tensorflow (optional)
        * my PC: GeForce GTX 1060 notebook (driver version - 419.35), Windows 10, Jupyter notebook
        * tensorflow needs Compute Capability 3.5 or higher. (https://www.tensorflow.org/install/gpu#hardware_requirements)
        - https://developer.nvidia.com/cuda-gpus
        - select "CUDA-Enabled GeForce Products"
        - result - "GeForce GTX 1060    Compute Capability = 6.1"
        - my card can work with tf!
    3. find versions of CUDA Toolkit and cuDNN SDK, that you need
        a) find your tf version
            import tensorflow as tf
            print(tf.__version__)
            # my output was => 1.13.1
        b) find right versions of CUDA Toolkit and cuDNN SDK for your tf version
            https://www.tensorflow.org/install/source#linux
            * it is written for Linux, but it worked in my case
            the table shows that tensorflow_gpu-1.13.1 needs: CUDA Toolkit v10.0, cuDNN SDK v7.4
    4. install CUDA Toolkit
        a) install CUDA Toolkit 10.0
            https://developer.nvidia.com/cuda-toolkit-archive
            select: CUDA Toolkit 10.0 and download base installer (2 GB)
            installation settings: select only CUDA
                (my installation path was: D:\Programs\x64\Nvidia\Cuda_v_10_0\Development)
        b) add environment variables:
            system variables / path must have:
                D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\bin
                D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\libnvvp
                D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\extras\CUPTI\libx64
                D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\include
    5. install cuDNN SDK
        a) download cuDNN SDK v7.4
            https://developer.nvidia.com/rdp/cudnn-archive (needs registration, but it is simple)
            select "Download cuDNN v7.4.2 (Dec 14, 2018), for CUDA 10.0"
        b) add path to 'bin' folder into "environment variables / system variables / path":
            D:\Programs\x64\Nvidia\cudnn_for_cuda_10_0\bin
    6.  pip uninstall tensorflow
        pip install tensorflow-gpu 
    7. check if tensorflow sees your GPU
        restart your PC
        print(get_available_devices()) 
        # now this code should return => ['/device:CPU:0', '/device:GPU:0']
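
If the GPU still does not show up, one thing worth checking is whether the directories from steps 4b and 5b actually made it into PATH. The following is a rough sketch, not part of the official install instructions. The helper name missing_path_entries is made up here, and the directory list simply repeats the example install locations used above, so adjust it to your own paths.

```python
import ntpath
import os

# Example directories from steps 4b and 5b above; replace with your own install paths.
REQUIRED_DIRS = [
    r"D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\bin",
    r"D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\libnvvp",
    r"D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\extras\CUPTI\libx64",
    r"D:\Programs\x64\Nvidia\cudnn_for_cuda_10_0\bin",
]


def missing_path_entries(required_dirs, path_value):
    """Return the required directories that do not appear in a Windows PATH string."""
    def norm(p):
        # Windows paths are case-insensitive; also drop quotes and trailing slashes.
        return ntpath.normcase(ntpath.normpath(p.strip().strip('"')))

    present = {norm(p) for p in path_value.split(";") if p.strip()}
    return [d for d in required_dirs if norm(d) not in present]


if __name__ == "__main__":
    for entry in missing_path_entries(REQUIRED_DIRS, os.environ.get("PATH", "")):
        print("not on PATH:", entry)
```

An empty result only means the PATH entries are present; it does not prove that the CUDA and cuDNN versions match your tf version.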

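Answer 2 (score: 0)

The reason it doesn't work is that your JSON file (which holds the model architecture) is still configured for CuDNNLSTM. Keras can now automatically load CuDNNLSTM weights into an LSTM architecture, but it will not change the architecture for you automatically.

The fix for this is simple: open your .json file and change every instance of CuDNNLSTM to LSTM. Save the JSON file, and you should then be able to load the weights from your .h5 file.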
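
The manual edit described above can also be scripted. This is a minimal sketch using only the standard json module. The function names and the file names in the usage note are made up for illustration, and it assumes the architecture JSON marks layers with a "class_name" key, as Keras model JSON normally does. It only renames the layer class. Depending on your Keras version, the LSTM layer may also need recurrent_activation set to "sigmoid" to reproduce CuDNNLSTM's outputs, which this sketch does not handle.

```python
import json


def replace_cudnn_lstm(node):
    """Recursively rename CuDNNLSTM layers to LSTM in a parsed model config."""
    if isinstance(node, dict):
        if node.get("class_name") == "CuDNNLSTM":
            node["class_name"] = "LSTM"
        # Recurse so layers nested in wrappers (e.g. Bidirectional) are covered too.
        for value in node.values():
            replace_cudnn_lstm(value)
    elif isinstance(node, list):
        for item in node:
            replace_cudnn_lstm(item)
    return node


def convert_model_json(text):
    """Take the architecture JSON as a string and return the converted JSON string."""
    return json.dumps(replace_cudnn_lstm(json.loads(text)))


def convert_file(src_path, dst_path):
    """Read a model JSON file, convert it, and write the result to a new file."""
    with open(src_path) as src:
        converted = convert_model_json(src.read())
    with open(dst_path, "w") as dst:
        dst.write(converted)
```

For example, convert_file("model.json", "model_cpu.json") would write a converted copy (both file names are placeholders), which you can then load before loading the weights from the .h5 file.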