The kernel appears to have died. It will restart automatically. Is this a memory problem?

Time: 2019-04-18 01:30:43

Tags: python-3.x tensorflow machine-learning neural-network tensorflow2.0

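The kernel dies after running some code.
I am trying to run code that uses a generator model to produce a sample image. I tried updating conda and Jupyter, but neither helped.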

I have been watching the GPU's memory usage, and the code does not appear to use much of the GPU.

  

TensorFlow 2.0, Ubuntu 18.10, CUDA 10.0,
Python 3.5

import tensorflow as tf
from tensorflow.keras import layers


def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256) # Note: None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model


# Build the generator and run it once on random noise (inference mode).
generator = make_generator_model()

noise = tf.random.normal([1, 100])
generated_image = generator(noise, training=False)
  

[I 10:20:06.664 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel 4406ce3b-1b5b-4ef8-aba9-d5fd9ed129e7 restarted
2019-04-18 10:20:21.002451: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-04-18 10:20:21.081020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1589] Found device 0 with properties:
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:42:00.0
totalMemory: 11.91GiB freeMemory: 340.69MiB
2019-04-18 10:20:21.081054: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1712] Adding visible gpu devices: 0
2019-04-18 10:20:21.081382: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-04-18 10:20:21.107510: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55de6ead0990 executing computations on platform CUDA. Devices:
2019-04-18 10:20:21.107562: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): TITAN Xp, Compute Capability 6.1
2019-04-18 10:20:21.127890: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3493050000 Hz
2019-04-18 10:20:21.129460: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55de6eed7eb0 executing computations on platform Host. Devices:
2019-04-18 10:20:21.129503: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2019-04-18 10:20:21.129616: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1712] Adding visible gpu devices: 0
2019-04-18 10:20:21.129722: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-04-18 10:20:21.130785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-18 10:20:21.130807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] 0
2019-04-18 10:20:21.130819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1139] 0: N
2019-04-18 10:20:21.131090: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1260] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 115 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:42:00.0, compute capability: 6.1)
2019-04-18 10:20:24.168083: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-04-18 10:20:24.331094: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-04-18 10:20:24.789774: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-04-18 10:20:24.791468: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-04-18 10:20:24.791484: F tensorflow/core/kernels/conv_grad_input_ops.cc:949] Check failed: stream->parent()->GetConvolveBackwardDataAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(stream->parent()), &algorithms)
[I 10:20:27.669 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel 4406ce3b-1b5b-4ef8-aba9-d5fd9ed129e7 restarted
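
The log above reports only a few hundred MiB of free GPU memory just before cuDNN fails to create its handle. One mitigation that is often tried in this situation, though nobody in this thread suggests it, is to ask TensorFlow to allocate GPU memory on demand. The sketch below assumes the TensorFlow 2.0 `tf.config.experimental` API; the helper name `enable_gpu_memory_growth` is made up for illustration. It cannot reclaim memory that another process is already holding, so it may not be enough on its own.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU allocations on demand.

    Returns the number of GPUs that were configured (0 if TensorFlow
    is not installed or no GPU is visible).
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0  # TensorFlow is not available; nothing to configure

    gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in gpus:
        # This has to run before any op touches the GPU; TensorFlow
        # raises RuntimeError if the device is already initialized.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


configured = enable_gpu_memory_growth()
print("GPUs configured for memory growth:", configured)
```

In a notebook this would go in the first cell, before the model is built.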

1 answer:

Answer 0: (score: 0)

Judging by the error output, this looks like a memory problem.

"totalMemory: 11.91GiB freeMemory: 340.69MiB"
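
To put the quoted figures in proportion, here is a small sketch that parses the line and computes the share of memory that was free. The helper name `parse_gpu_memory` is made up for illustration, and the regular expression only assumes the `totalMemory: ... freeMemory: ...` layout quoted above.

```python
import re


def parse_gpu_memory(line):
    """Return (total_mib, free_mib) from a 'totalMemory: X freeMemory: Y' line."""
    units = {"MiB": 1.0, "GiB": 1024.0}
    values = {}
    for key in ("totalMemory", "freeMemory"):
        match = re.search(key + r":\s*([\d.]+)\s*(MiB|GiB)", line)
        if match is None:
            raise ValueError("missing field: " + key)
        # Convert everything to MiB so the two values are comparable.
        values[key] = float(match.group(1)) * units[match.group(2)]
    return values["totalMemory"], values["freeMemory"]


total_mib, free_mib = parse_gpu_memory("totalMemory: 11.91GiB freeMemory: 340.69MiB")
free_fraction = free_mib / total_mib
print("free: {:.1f}%".format(100 * free_fraction))  # free: 2.8%
```

Under 3% of the memory was free when TensorFlow started, which is consistent with the answer's reading of the log.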

Try restarting the PC and check how much RAM is free as soon as it comes back up. Then run the code again and see whether it works.
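
The figures quoted above come from TensorFlow's GPU device log, so the memory worth checking is probably the GPU's. A minimal sketch of such a check, assuming the `nvidia-smi` command-line tool is installed and on the PATH; the helper names and the sample string are made up for illustration.

```python
import subprocess

# Ask nvidia-smi for per-GPU memory in MiB, as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.total,memory.free",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse 'index, total, free' CSV rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, total, free = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "total_mib": int(total),
            "free_mib": int(free),
        })
    return rows


def query_gpu_memory():
    """Return memory info per GPU, or [] if nvidia-smi cannot be used."""
    try:
        output = subprocess.check_output(QUERY, universal_newlines=True)
        return parse_memory_csv(output)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return []


# Made-up sample output, used only to demonstrate the parser.
sample_rows = parse_memory_csv("0, 12196, 341\n")
print(sample_rows)

for row in query_gpu_memory():
    print("GPU {index}: {free_mib} MiB free of {total_mib} MiB".format(**row))
```

Running this before starting the notebook shows whether the GPU is already nearly full. If it is, running plain `nvidia-smi` lists the processes that hold the memory.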