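I'm having a problem with Keras. Whenever I try to fit a model that contains Conv2D layers, it crashes with the error "Segmentation fault (core dumped)".
My code works on the CPU. It also works without any Conv2D layers (although that is useless for my use case). I have installed CUDA, cuDNN and TensorFlow, and I have tried reinstalling Keras and TensorFlow.
Code: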
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

# env(), env_size() and swisher are not shown in this excerpt.
def model_build():
    model = Sequential()
    model.add(Conv2D(input_shape=(env_size()[0], env_size()[1], 1), filters=4, kernel_size=(3,3), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Flatten())
    model.add(Dense(128, activation='softmax'))
    model.add(Dense(4, activation='softmax'))
    return model

if __name__ == '__main__':
    y = model_build()
    y.compile(loss="mean_squared_error", optimizer='adam')
    y.fit(x=env(), y=np.array([[0,0,0,0]]))
Error:
Using TensorFlow backend.
Epoch 1/1
2019-03-27 05:52:27.687323: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-27 05:52:27.789975: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-27 05:52:27.790819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
totalMemory: 5.73GiB freeMemory: 5.40GiB
2019-03-27 05:52:27.790834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2019-03-27 05:52:28.068080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-27 05:52:28.068115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] 0
2019-03-27 05:52:28.068121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0: N
2019-03-27 05:52:28.068487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5147 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-03-27 05:52:28.177752: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.337277: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.500486: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.586280: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.675738: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
Segmentation fault (core dumped)
EDIT:
A self-contained example:
import numpy as np
import keras
model = keras.models.Sequential() #Sequential model type.
model.add(keras.layers.Conv2D(filters=1, kernel_size=(3,3), strides = 1, activation="sigmoid")) #Convolutional layer.
model.add(keras.layers.Flatten()) #Flatten layer.
model.add(keras.layers.Dense(4)) #Dense layer of 4 units.
model.compile(loss='mean_squared_error', optimizer='adam') #compile model.
y = np.random.rand(1,4) #Random expected output
x = np.random.rand(1, 38, 21, 1) # Random input.
model.fit(x, y) #And fit...
EDIT2:
Keras version: "v2.1.6-tf"
TensorFlow-GPU version: "v1.12"
Python version: "v3.5.2"
CUDA version: "v9.0.176"
cuDNN version: "v7.2.1.38-1+cuda9.0"
Ubuntu version: "v16.04"
Answer 0 (score: 0)
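Your GPU does not seem to have enough memory. Your model does not look very large, so I suspect the problem is in this line:
y.fit(x=env(), y=np.array([[0,0,0,0]]))
The output of env() may be too large for your GPU memory to handle.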
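To make the memory question concrete, here is a rough sketch (not taken from the question) that estimates how many values reach Flatten() and how large the float32 weight matrix of the following Dense(128) layer would be. It assumes Keras' default padding='valid' and stride 1 for the four Conv2D layers shown in the question; the function name and the 490 x 546 input size are hypothetical, since env_size() is not shown. As plain arithmetic, the 518619136 bytes in the allocator warnings are about 494.6 MiB, which equals 4 bytes x 128 x 1012928. That would be consistent with such a weight matrix, but it does not prove where the allocation comes from.

```python
def flatten_and_dense_size(height, width, kernels=(3, 5, 5, 5),
                           filters=4, units=128, bytes_per_value=4):
    """Return (flattened feature count, Dense weight matrix size in bytes).

    Assumes 'valid' padding and stride 1, so each convolution shrinks
    every spatial dimension by (kernel_size - 1).
    """
    for kernel in kernels:
        height -= kernel - 1
        width -= kernel - 1
    if height <= 0 or width <= 0:
        raise ValueError("input is too small for these kernel sizes")
    flat = height * width * filters               # values coming out of Flatten()
    return flat, flat * units * bytes_per_value   # weights only, bias ignored


# Input size of the self-contained example in the question (38 x 21).
print(flatten_and_dense_size(38, 21))

# Hypothetical input size chosen only because it reproduces the warned figure.
flat, size_in_bytes = flatten_and_dense_size(490, 546)
print(flat, size_in_bytes, size_in_bytes / 2**20)
```

If the real input is anywhere near that large, adding pooling or strides before Flatten(), or shrinking the first Dense layer, would reduce this matrix considerably.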
Answer 1 (score: 0)
Your MWE works fine for me (if I add , input_shape=(38, 21, 1) to the first convolutional layer):
import numpy as np
import keras
model = keras.models.Sequential() #Sequential model type.
model.add(keras.layers.Conv2D(filters=1, kernel_size=(3,3), strides = 1, activation="sigmoid", input_shape=(38, 21, 1))) #Convolutional layer.
model.add(keras.layers.Flatten()) #Flatten layer.
model.add(keras.layers.Dense(4)) #Dense layer of 4 units.
model.compile(loss='mean_squared_error', optimizer='adam') #compile model.
y = np.random.rand(2, 4) #Random expected output
x = np.random.rand(2, 38, 21, 1) # Random input.
model.fit(x, y)
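This means your problem must come from your system or your installation.
The TensorFlow compatibility chart indicates that your Python, TensorFlow and CUDA versions should be compatible.
For your configuration, cuDNN version 7.0.x is recommended. The cuDNN version 7.2 you are using may be incompatible.
Try installing and using cuDNN 7.0.x.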
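As a small aid for comparing an installation against the chart, the sketch below lists the installed Python package versions and extracts the major.minor part of a cuDNN version string. It is only an illustration: the helper names are made up, the package names are the usual PyPI distribution names, and the cuDNN string is copied by hand from the question because cuDNN is a system library rather than a Python package. It uses importlib.metadata, so it needs Python 3.8 or newer.

```python
import re
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version_string):
    """Extract (major, minor) from strings such as 'v7.2.1.38-1+cuda9.0'."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("no version number found in %r" % version_string)
    return int(match.group(1)), int(match.group(2))


for name in ("tensorflow", "tensorflow-gpu", "keras", "numpy"):
    print(name, installed_version(name))

reported_cudnn = "v7.2.1.38-1+cuda9.0"  # copied from the question
suggested_cudnn = "7.0.x"               # the version suggested in this answer
if major_minor(reported_cudnn) != major_minor(suggested_cudnn):
    print("cuDNN", major_minor(reported_cudnn), "differs from suggested",
          major_minor(suggested_cudnn))
```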