Question

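In TensorFlow/Keras, when running the code from https://github.com/pierluigiferrari/ssd_keras with the evaluator ssd300_evaluation, I received this error:

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

This is very similar to the unsolved question: Google Colab Error : Failed to get convolution algorithm.This is probably because cuDNN failed to initialize

I am running with: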

Python: 3.6.4

TensorFlow version: 1.12.0

Keras version: 2.2.4

CUDA: V10.0

cuDNN: V7.4.1.5

NVIDIA GeForce GTX 1080

I also ran:

import tensorflow as tf
with tf.device('/gpu:0'):
      a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
      b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
      c = tf.matmul(a, b)
with tf.Session() as sess:
      print(sess.run(c))

with no errors or issues.

The minimal example is:

 from keras import backend as K
 from keras.models import load_model
 from keras.optimizers import Adam
 from scipy.misc import imread
 import numpy as np
 from matplotlib import pyplot as plt

 from models.keras_ssd300 import ssd_300
 from keras_loss_function.keras_ssd_loss import SSDLoss
 from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes
 from keras_layers.keras_layer_DecodeDetections import DecodeDetections
 from keras_layers.keras_layer_DecodeDetectionsFast import DecodeDetectionsFast
 from keras_layers.keras_layer_L2Normalization import L2Normalization
 from data_generator.object_detection_2d_data_generator import DataGenerator
 from eval_utils.average_precision_evaluator import Evaluator
 import tensorflow as tf
 %matplotlib inline
 import keras
 keras.__version__



 # Set a few configuration parameters.
 img_height = 300
 img_width = 300
 n_classes = 20
 model_mode = 'inference'


 K.clear_session() # Clear previous models from memory.

 model = ssd_300(image_size=(img_height, img_width, 3),
            n_classes=n_classes,
            mode=model_mode,
            l2_regularization=0.0005,
            scales=[0.1, 0.2, 0.37, 0.54, 0.71, 0.88, 1.05], # The scales for MS COCO [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]
            aspect_ratios_per_layer=[[1.0, 2.0, 0.5],
                                     [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                     [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                     [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                     [1.0, 2.0, 0.5],
                                     [1.0, 2.0, 0.5]],
            two_boxes_for_ar1=True,
            steps=[8, 16, 32, 64, 100, 300],
            offsets=[0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
            clip_boxes=False,
            variances=[0.1, 0.1, 0.2, 0.2],
            normalize_coords=True,
            subtract_mean=[123, 117, 104],
            swap_channels=[2, 1, 0],
            confidence_thresh=0.01,
            iou_threshold=0.45,
            top_k=200,
            nms_max_output_size=400)

 # 2: Load the trained weights into the model.

 # TODO: Set the path of the trained weights.
 weights_path = 'C:/Users/USAgData/TF SSD Keras/weights/VGG_VOC0712Plus_SSD_300x300_iter_240000.h5'

 model.load_weights(weights_path, by_name=True)

 # 3: Compile the model so that Keras won't complain the next time you load it.

 adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)

 ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)

 model.compile(optimizer=adam, loss=ssd_loss.compute_loss)


dataset = DataGenerator()

# TODO: Set the paths to the dataset here.
dir= "C:/Users/USAgData/TF SSD Keras/VOC/VOCtest_06-Nov-2007/VOCdevkit/VOC2007/"
Pascal_VOC_dataset_images_dir = dir+ 'JPEGImages'
Pascal_VOC_dataset_annotations_dir = dir + 'Annotations/'
Pascal_VOC_dataset_image_set_filename = dir+'ImageSets/Main/test.txt'

# The XML parser needs to know what object class names to look for and in which order to map them to integers.
classes = ['background',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat',
           'chair', 'cow', 'diningtable', 'dog',
           'horse', 'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor']

dataset.parse_xml(images_dirs=[Pascal_VOC_dataset_images_dir],
                  image_set_filenames=[Pascal_VOC_dataset_image_set_filename],
                  annotations_dirs=[Pascal_VOC_dataset_annotations_dir],
                  classes=classes,
                  include_classes='all',
                  exclude_truncated=False,
                  exclude_difficult=False,
                  ret=False)



evaluator = Evaluator(model=model,
                      n_classes=n_classes,
                      data_generator=dataset,
                      model_mode=model_mode)



results = evaluator(img_height=img_height,
                    img_width=img_width,
                    batch_size=8,
                    data_generator_mode='resize',
                    round_confidences=False,
                    matching_iou_threshold=0.5,
                    border_pixels='include',
                    sorting_algorithm='quicksort',
                    average_precision_mode='sample',
                    num_recall_points=11,
                    ignore_neutral_boxes=True,
                    return_precisions=True,
                    return_recalls=True,
                    return_average_precisions=True,
                    verbose=True)

Answer 1

I had the same problem and solved it with this:

import os
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

or

physical_devices = tf.config.experimental.list_physical_devices('GPU')
if len(physical_devices) > 0:
   tf.config.experimental.set_memory_growth(physical_devices[0], True)
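
The two options above can be combined in one place. The sketch below is only an illustration, assuming TensorFlow 2.x: it sets the environment variable before TensorFlow is imported and then enables memory growth on every visible GPU rather than only the first one. The guarded import is there only so the snippet still runs on a machine without TensorFlow.

```python
import os

# Set the variable before the first TensorFlow import so it is already
# in place when TensorFlow initializes the GPU.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf
except ImportError:  # TensorFlow not installed: nothing else to configure
    tf = None

if tf is not None:
    # Enable memory growth on every GPU, not just physical_devices[0].
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except (RuntimeError, ValueError) as err:
            # Raised, for example, when the GPUs were already initialized.
            print(err)
```

Either mechanism on its own is what the answer describes; using both is redundant but harmless.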

Answer 2

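I had the same problem with TensorFlow 2.4 and CUDA 11.0 with cuDNN v8.0.4. I wasted almost 2 to 3 days trying to solve it. The problem was simply a driver mismatch. I had installed CUDA 11.0 Update 1, assuming that an update would work fine, but that was the culprit. I uninstalled CUDA 11.0 Update 1 and installed the version without the update. Here is the list of drivers that worked for TensorFlow 2.4 on an RTX 2060 6GB GPU.

cuDNN v8.0.4 for CUDA 11.0: select your preferred OS and download
CUDA Toolkit 11.0: select your OS

The list of required hardware and software is mentioned here

I also had to do this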

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU') 
tf.config.experimental.set_memory_growth(physical_devices[0], True)

to avoid this error

2020-12-23 21:54:14.971709: I tensorflow/stream_executor/stream.cc:1404] [stream=000001E69C1DA210,impl=000001E6A9F88E20] did not wait for [stream=000001E69C1DA180,impl=000001E6A9F88730]
2020-12-23 21:54:15.211338: F tensorflow/core/common_runtime/gpu/gpu_util.cc:340] CPU->GPU Memcpy failed
[I 21:54:16.071 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
kernel 8b907ea5-33f1-4b2a-96cc-4a7a4c885d74 restarted
kernel 8b907ea5-33f1-4b2a-96cc-4a7a4c885d74 restarted

These are some examples of the errors I was getting

Type 1

UnpicklingError: invalid load key, 'H'.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-2-f049ceaad66a> in <module>

Type 2


InternalError: Blas GEMM launch failed : a.shape=(15, 768), b.shape=(768, 768), m=15, n=768, k=768 [Op:MatMul]

During handling of the above exception, another exception occurred:

Type 3

failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.534375: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.534683: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.534923: E tensorflow/stream_executor/cuda/cuda_blas.cc:226] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.539327: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.539523: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-12-23 21:31:04.539665: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at conv_ops_fused_impl.h:697 : Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Answer 3

This is a follow-up to point 2 of https://stackoverflow.com/a/56511889/2037998.

2. You are out of memory

I used the following code to limit GPU RAM usage:


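This code sample is from: TensorFlow: Use a GPU: Limiting GPU memory growth. Put this code before any other TF/Keras code you are using.

Note: The application might still use somewhat more GPU RAM than the number above.

Note 2: If the system also runs other applications (such as a UI), those programs can also consume some GPU RAM. (Xorg, Firefox, ... sometimes up to 1 GB of GPU RAM combined.)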
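
The two notes suggest leaving headroom when choosing the limit. The sketch below is an illustration only, not the code this answer originally used. The helper names, the 256 MiB safety margin, and the example total of 6086 MiB (taken from the nvidia-smi readout quoted in Answer 9) are assumptions; the 1024 MiB reserved for other applications follows Note 2. It assumes TensorFlow 2.x and uses the same tf.config.experimental calls that appear in Answer 12.

```python
def pick_memory_limit_mib(total_mib, other_apps_mib=1024, safety_margin_mib=256):
    """Return a TensorFlow memory limit that leaves room for other GPU users."""
    limit = total_mib - other_apps_mib - safety_margin_mib
    if limit <= 0:
        raise ValueError("no GPU memory left after reserving headroom")
    return limit


def apply_memory_limit(limit_mib):
    """Cap TensorFlow's allocation on the first GPU; return True if applied."""
    try:
        import tensorflow as tf
    except ImportError:
        return False
    gpus = tf.config.experimental.list_physical_devices("GPU")
    if not gpus:
        return False
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=limit_mib)],
        )
    except RuntimeError as err:
        # Virtual devices must be configured before the GPU is initialized.
        print(err)
        return False
    return True


# Example: a card reporting 6086 MiB in total.
limit = pick_memory_limit_mib(6086)
print(limit)  # 4806
```

Calling apply_memory_limit(limit) before any other TF/Keras code would then apply the cap.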

Answer 4

I had the same problem, but adding these lines of code at the beginning solved it:

physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

This works for TensorFlow v2.

Answer 5

The problem is that the newer TensorFlow 1.10.x versions are incompatible with cuDNN 7.0.5 and CUDA 9.0. The easiest fix is to downgrade TensorFlow to 1.8.0

pip install --upgrade tensorflow-gpu==1.8.0

Answer 6

Just add

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

Answer 7

I had the same problem on an RTX 2080. The following code then worked for me.

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

Answer 8

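I got the same error. The cause is a mismatch between the CUDA/cuDNN versions and your TensorFlow version. There are two ways to solve it:

Either downgrade your TensorFlow version: pip install --upgrade tensorflow-gpu==1.8.0
Or follow the steps at Here.

Tip: choose your Ubuntu version and follow the steps :-)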

Answer 9

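I have seen this error message for three different reasons, with different solutions:

1. You have cache issues

I regularly work around this error by shutting down my Python process, removing the ~/.nv directory (on Linux, rm -rf ~/.nv), and restarting the Python process. I don't exactly know why this works. It is probably at least partly related to the second cause below.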
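
For anyone who prefers to do the cleanup from Python, here is a minimal sketch of the same step. The function name is hypothetical, and it assumes the cache lives in ~/.nv as described above; run it only after the process that was using the GPU has exited.

```python
import shutil
from pathlib import Path


def clear_nv_cache(cache_dir=None):
    """Delete the NVIDIA cache directory; return True if something was removed."""
    cache_dir = Path(cache_dir) if cache_dir is not None else Path.home() / ".nv"
    if not cache_dir.is_dir():
        return False  # nothing to clean
    shutil.rmtree(cache_dir)
    return True
```

Calling clear_nv_cache() with no argument is the equivalent of rm -rf ~/.nv.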

2. You are out of memory

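The error can also show up if you run out of graphics card RAM. With an NVIDIA GPU you can check the graphics card memory usage with nvidia-smi. It gives you a readout of how much GPU RAM is in use (something like 6025MiB / 6086MiB if you are almost at the limit) as well as a list of the processes that are using GPU RAM.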
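
The same readout can be collected from a script. The sketch below is an illustration with hypothetical function names; it assumes nvidia-smi is on the PATH and relies on its --query-gpu and --format=csv,noheader,nounits options.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Turn lines such as '6025, 6086' into (used_mib, total_mib) tuples."""
    usage = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        usage.append((used, total))
    return usage


def gpu_memory_usage():
    """Run nvidia-smi and return one (used_mib, total_mib) tuple per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


print(parse_memory_csv("6025, 6086\n"))  # [(6025, 6086)]
```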

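If you have run out of RAM, you will need to restart the process (which should free up the RAM) and then take a less memory-intensive approach. A few options are:

reducing your batch size
using a simpler model
using less data
limiting the TensorFlow GPU memory fraction: for example, the following makes sure TensorFlow uses <= 90% of your RAM: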

import keras
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.9
keras.backend.tensorflow_backend.set_session(tf.Session(config=config))

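This can slow down model evaluation if it is not used together with the items above, presumably because the large data set has to be swapped in and out to fit into the small amount of memory you have allocated.

3. You have incompatible versions of CUDA, TensorFlow, NVIDIA drivers, etc.

If you have never had similar models working, you are not running out of VRAM, and your cache is clean, I would go back and set up CUDA + TensorFlow using the best available installation guide. I have had the most success following the instructions at https://www.tensorflow.org/install/gpu rather than those on the NVIDIA / CUDA site.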

Answer 10

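Keras is included in TensorFlow 2.0 and above. So

remove import keras and
replace from keras.module.module import class statements with from tensorflow.keras.module.module import class
Maybe your GPU memory is full. So use allow_growth = True in the GPU options. This is deprecated now, but using the code snippet below after your imports may solve your problem.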

import tensorflow as tf

from tensorflow.compat.v1.keras.backend import set_session

config = tf.compat.v1.ConfigProto()

config.gpu_options.allow_growth = True # dynamically grow the memory used on the GPU

config.log_device_placement = True # to log device placement (on which device the operation ran)

sess = tf.compat.v1.Session(config=config)

set_session(sess)
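
The import change described at the start of this answer can be sketched as a small text rewrite. This is an illustration with a hypothetical function name, not a full migration tool. One deliberate deviation, which is my assumption rather than the answer's advice: a bare import keras is replaced with from tensorflow import keras instead of being deleted, so later keras.* references keep resolving.

```python
import re


def rewrite_keras_imports(source):
    """Point standalone Keras imports at the Keras bundled with TensorFlow."""
    # from keras.models import X  ->  from tensorflow.keras.models import X
    # The lookahead keeps unrelated packages such as keras_layers untouched.
    source = re.sub(
        r"^([ \t]*)from keras(?=[. ])",
        r"\1from tensorflow.keras",
        source,
        flags=re.MULTILINE,
    )
    # import keras  ->  from tensorflow import keras
    source = re.sub(
        r"^([ \t]*)import keras[ \t]*$",
        r"\1from tensorflow import keras",
        source,
        flags=re.MULTILINE,
    )
    return source


example = (
    "import keras\n"
    "from keras.models import load_model\n"
    "from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes\n"
)
print(rewrite_keras_imports(example))
```

In the example, the first two lines are rewritten and the keras_layers import from the question's repository is left alone.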

Answer 11

I ran into this problem after upgrading to TF 2.0. The following started giving the error:

   outputs = tf.nn.conv2d(images, filters, strides=1, padding="SAME")

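I am using Ubuntu 16.04.6 LTS (Azure Data Science VM) and TensorFlow 2.0. I upgraded following the instructions on this TensorFlow GPU instructions page. That resolved the issue for me. By the way, it is a bunch of apt-get update/install commands, and I executed all of them.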

Answer 12

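I had the same problem. I am using a conda environment, so my packages are managed automatically by conda. I solved the problem by restricting memory allocation for TensorFlow v2 (Python 3.x)

physical_devices = tf.config.experimental.list_physical_devices('GPU')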
tf.config.experimental.set_memory_growth(physical_devices[0], True)

This solved my problem. However, it limits the memory a lot. When I simultaneously ran

nvidia-smi

I saw about 700 MB in use. To see more options, one can look at the code on TensorFlow's website:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

In my case, the code snippet above solved the issue perfectly.

Note: I did not try installing TensorFlow with pip; this worked with a conda-installed TensorFlow.

Ubuntu: 18.04

Python: 3.8.5

tensorflow: 2.2.0

cudnn: 7.6.5

cudatoolkit: 10.1.243

Answer 13

I had this error and fixed it by uninstalling all CUDA and cuDNN versions from my system. Then I installed CUDA Toolkit 9.0 (without any patches) and cuDNN v7.4.1 for CUDA 9.0.

Answer 14

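Without any rep I cannot add this as a comment to the two existing answers above from Anurag and Obnebion, nor can I upvote them, so I am writing a new answer even though it seems to break the guidelines. Anyway, I originally had the problem that the other answers on this page address, and fixed it, but then hit the same message again later when I started using checkpoint callbacks. At that point, only the Anurag/Obnebion answer was relevant. It turned out I had originally been saving the model as .json and the weights separately as .h5, then using model_from_json along with a separate model.load_weights to get the weights back. That worked (I have CUDA 10.2 and TensorFlow 2.x). It only broke when I tried to switch to the all-in-one save/load_model from the checkpoint callback. This is the small change I made to keras.callbacks.ModelCheckpoint in the _save_model method: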

                            if self.save_weights_only:
                                self.model.save_weights(filepath, overwrite=True)
                            else:
                                model_json = self.model.to_json()
                                with open(filepath+'.json','w') as fb:
                                    fb.write(model_json)
                                    fb.close()
                                self.model.save_weights(filepath+'.h5', overwrite=True)
                                with open(filepath+'-hist.pickle','wb') as fb:
                                    trainhistory = {"history": self.model.history.history,"params": self.model.history.params}
                                    pickle.dump(trainhistory,fb)
                                    fb.close()
                                # self.model.save(filepath, overwrite=True)

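The history pickle dump is just a kludge for yet another problem discussed on Stack Overflow: what happens to the history object when you exit early from a checkpoint callback. You can see in the _save_model method that there is a line that pulls the loss monitor array out of the logs dict... but never writes it to a file! So I just put in the kludge accordingly. Most people do not recommend using pickles like this. My code is just a hack, so it doesn't matter.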

Answer 15

I faced the same problem; I think the GPU was not able to load all the data at once. I resolved it by reducing the batch size.

Answer 16

It looks like the libraries need some warm-up. This is not a valid solution for production, but you can at least move on to other bugs...

from keras.models import Sequential
import numpy as np
from keras.layers import Dense
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
model = Sequential()
model.add(Dense(1000,input_dim=(784),activation='relu') )  #input layer
model.add(Dense(222,activation='relu'))                     #hidden layer
model.add(Dense(100,activation='relu'))   
model.add(Dense(50,activation='relu'))   
model.add(Dense(10,activation='sigmoid'))   
model.compile(optimizer="adam",loss='categorical_crossentropy',metrics=["accuracy"])
x_train = np.reshape(x_train,(60000,784))/255
x_test = np.reshape(x_test,(10000,784))/255
from keras.utils import np_utils
y_train = np_utils.to_categorical(y_train) 
y_test = np_utils.to_categorical(y_test)
model.fit(x_train[:1000],y_train[:1000],epochs=1,batch_size=32)

Answer 17

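I had a similar problem. TensorFlow complained that it expected a certain version of cuDNN but found a different one. So I downloaded the version it expected from https://developer.nvidia.com/rdp/cudnn-archive and installed it. It works now.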
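
To find out which CUDA and cuDNN versions an installed TensorFlow was built against, one option is tf.sysconfig.get_build_info(). The sketch below is an illustration with a hypothetical function name; it assumes a TensorFlow 2.x release recent enough to provide that call, and returns None otherwise.

```python
def tensorflow_build_versions():
    """Return the CUDA/cuDNN versions TensorFlow was built with, or None."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is None:
        return None  # older TensorFlow without build info
    info = get_build_info()
    # CPU-only builds carry no CUDA keys, so .get() yields None for them.
    return {"cuda": info.get("cuda_version"), "cudnn": info.get("cudnn_version")}


print(tensorflow_build_versions())
```

The reported versions can then be compared with what is actually installed on the system.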

Answer 18

If you are Chinese, please make sure your working path does not contain Chinese characters, and keep reducing your batch_size. Thanks!
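
A quick way to check a working path for such characters is sketched below. The function name is hypothetical, and it needs Python 3.7+ for str.isascii().

```python
from pathlib import Path


def non_ascii_parts(path):
    """Return the components of a path that contain non-ASCII characters."""
    return [part for part in Path(path).parts if not part.isascii()]


# Check the current working directory; an empty list means the path is ASCII-only.
print(non_ascii_parts(Path.cwd()))
```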

Answer 19

Add the following lines of code at the start of your notebook or code

import tensorflow as tf

physical_devices = tf.config.experimental.list_physical_devices('GPU')

tf.config.experimental.set_memory_growth(physical_devices[0], True)

Answer 20

Enabling memory growth on the GPU at the beginning of my code solved the problem:

import tensorflow as tf

physical_devices = tf.config.experimental.list_physical_devices('GPU')
print("Num GPUs Available: ", len(physical_devices))
tf.config.experimental.set_memory_growth(physical_devices[0], True)

Num GPUs Available:  1

Reference: https://deeplizard.com/learn/video/OO4HD-1wRN8

Answer 21

I had the same problem. My configuration was TensorFlow 1.13.1, CUDA 10.0, cuDNN 7.6.4. I tried changing the cuDNN version to 7.4.2 and, luckily, that solved the problem.

Answer 22

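I had the same problem, but with a simpler solution than the others posted here. I have both CUDA 10.0 and 10.2 installed, but I only had cuDNN for 10.2, and that version (at the time of this post) is not compatible with TensorFlow GPU. I just installed cuDNN for CUDA 10.0 and now everything runs fine!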

Answer 23

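As Anurag Bhalekar already observed above, this can be fixed with a dirty workaround: set up and run a model in your code before loading an old model with load_model() from keras. This apparently initializes cuDNN correctly, so that it can then be used for load_model().

In my case, I am using the Spyder IDE to run all my Python scripts. Specifically, I set up, train, and save a CNN in one script. After that, another script loads the saved model for visualization. If I open Spyder and directly run the visualization script to load an old, saved model, I get the same error as mentioned above. I could still load the model and modify it, but when I tried to create a prediction, I got the error.

However, if I first run my training script in a Spyder instance and then run the visualization script in the same Spyder instance, it works fine without any errors: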

#training a model correctly initializes cuDNN
model=Sequential()
model.add(Conv2D(32,...))
model.add(Dense(num_classes,...))
model.compile(...)
model.fit() #this all works fine

Then the following code, including load_model(), works fine:

#this script relies on cuDNN already being initialized by the script above
from keras.models import load_model
model = load_model(modelPath) #works
model = Model(inputs=model.inputs, outputs=model.layers[1].output) #works
feature_maps = model.predict(img) #produces the error only if the first piece of code is not run

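I could not figure out why this happens or how to solve the problem in a different way, but for me, training a small working keras model before using load_model() is a quick and dirty fix that does not require reinstalling cuDNN or anything else.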

Answer 24

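In my case, I hit this error when I loaded the model directly from .json and .h5 files and tried to predict the output for certain inputs. So before doing anything like that, I tried training a sample model on MNIST, which allowed cuDNN to initialize.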

Answer 25

1) Close all other notebooks that use the GPU

2) TF 2.0 needs the cuDNN SDK (>= 7.4.1)

Extract it and add the path to the "bin" folder to "Environment variables / System variables / Path": "D:\Programs\x64\Nvidia\cudnn\bin"
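
Whether the folder actually ended up on the search path can be verified from Python. This is a minimal sketch with a hypothetical function name; the directory in the commented usage line is the one from this answer and will differ on other machines.

```python
import os


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of the PATH variable."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normcase(os.path.normpath(directory))
    entries = (
        os.path.normcase(os.path.normpath(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    )
    return wanted in entries


# Usage on Windows with the folder named above:
# print(is_on_path(r"D:\Programs\x64\Nvidia\cudnn\bin"))
```

Remember that a changed system variable is only visible to processes started after the change, so restart the notebook server or IDE first.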

Answer 26

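This problem can also occur if there is an incompatible version of cuDNN, which may be the case if you installed TensorFlow with conda, because conda also installs CUDA and cuDNN while installing TensorFlow.

The solution is to install TensorFlow with pip, and to install CUDA and cuDNN separately without conda. For example, if you have CUDA 10.0.130 and cuDNN 7.4.1 (tested configurations), then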

pip install tensorflow-gpu==1.13.1
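
The combinations that answers in this thread report as working can be collected into a small lookup for a quick sanity check. This is only a sketch: the table is anecdotal (taken from Answers 2, 12, 21 and 26), it is not the official compatibility matrix, and the names are hypothetical.

```python
# TensorFlow major.minor -> (CUDA major.minor, cuDNN major.minor),
# as reported working by answers in this thread.
REPORTED_WORKING = {
    "1.13": ("10.0", "7.4"),  # Answers 21 and 26
    "2.2": ("10.1", "7.6"),   # Answer 12
    "2.4": ("11.0", "8.0"),   # Answer 2
}


def major_minor(version):
    """Reduce '10.0.130' to '10.0'."""
    return ".".join(version.split(".")[:2])


def check_combo(tf_version, cuda_version, cudnn_version):
    """Return True/False for a known TensorFlow version, None if it is not listed."""
    expected = REPORTED_WORKING.get(major_minor(tf_version))
    if expected is None:
        return None
    return (major_minor(cuda_version), major_minor(cudnn_version)) == expected


print(check_combo("1.13.1", "10.0.130", "7.4.1"))  # True
print(check_combo("1.13.1", "10.0", "7.6.4"))      # False
```

The second call mirrors the failing setup from Answer 21, which was fixed by switching to cuDNN 7.4.2.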

Answer 27

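I struggled with this problem for a week. The reason was very silly: I was using high-resolution photos for training.

Hopefully this will save someone some time :)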
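
A rough calculation shows why resolution matters. The sketch below counts only the input batch stored as float32; weights, activations and the cuDNN workspace come on top, so treat it as a lower bound. The function name is hypothetical.

```python
def input_batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes needed for one input batch stored as float32."""
    return batch_size * height * width * channels * bytes_per_value


# Batch of 8 images at the 300x300 size used in the question.
print(input_batch_bytes(8, 300, 300))    # 8640000

# The same batch at 3000x3000 needs 100 times as much.
print(input_batch_bytes(8, 3000, 3000))  # 864000000
```

Downscaling the photos or lowering the batch size reduces this number proportionally.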

Answer 28

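I struggled with this for a while when working on an AWS Ubuntu instance.

Then I found the solution, which was quite simple in this case.

Do not install tensorflow-gpu with pip (pip install tensorflow-gpu), but with conda (conda install tensorflow-gpu), so that it lives in the conda environment and cudatoolkit and cudnn are installed in the right environment.

That worked for me and saved my day; I hope it helps someone else.

See the original solution here from learnermaxRL: https://github.com/tensorflow/tensorflow/issues/24828#issuecomment-453727142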

Answer 29

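Workaround: I did a fresh install of TF 2.0 and ran a simple MNIST tutorial, which was fine. I then opened another notebook, tried to run it, and hit this issue. I exited all notebooks, restarted Jupyter, and opened only one notebook, which ran successfully. The issue seems to be either memory or running more than one notebook on the GPU.

Thanks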
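
To see which processes (for example, other notebook kernels) are holding GPU memory, nvidia-smi can list compute applications. The sketch below is an illustration with hypothetical function names; it assumes nvidia-smi is on the PATH and uses its --query-compute-apps option. The memory field may be reported as [N/A] on some systems, which is mapped to None here.

```python
import subprocess

APPS_QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader",
]


def parse_compute_apps(text):
    """Parse lines like '4242, python, 700 MiB' into (pid, name, used_mib)."""
    apps = []
    for line in text.strip().splitlines():
        pid, rest = line.split(",", 1)
        name, used = rest.rsplit(",", 1)
        tokens = used.split()
        used_mib = int(tokens[0]) if tokens and tokens[0].isdigit() else None
        apps.append((int(pid), name.strip(), used_mib))
    return apps


def gpu_processes():
    """Run nvidia-smi and return the processes currently using the GPU."""
    result = subprocess.run(APPS_QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)


print(parse_compute_apps("4242, python, 700 MiB\n"))  # [(4242, 'python', 700)]
```

If more than one Python process shows up, shutting down the extra kernels is the step this answer describes.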

Answer 30

If you are using TensorFlow version 1.13, simply add the following three lines right after the TensorFlow import line.



Note: I hit this error with Mask-RCNN.
