Keras model.train_on_batch crashes my kernel, but model.fit does not

Asked: 2019-07-02 17:21:52

Tags: python tensorflow machine-learning keras neural-network

I am training convolutional neural networks on a list of DNA sequences with Keras (TensorFlow backend) to predict certain properties of the sequences. I have set up two networks:

  1. Pad the start of my one-hot encoded sequences so that all sequences have the same length, which lets me pass them all through in a single call to model.fit. Since I cannot mask the input with Conv2D, the padding dilutes my training data.

  2. Train the model sequence by sequence without padding, so the model accepts variable-length input. I hoped this would improve the model's accuracy, but every time I loop over the list of sequences and call model.train_on_batch, my IPython kernel crashes.
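The padding step in approach 1 can be sketched as follows (a minimal NumPy sketch; `pad_one_hot` is a hypothetical helper, not code from the question):

```python
import numpy as np

def pad_one_hot(seqs, hot_dim=5):
    """Left-pad a list of (length, hot_dim) one-hot arrays with zeros
    so they all match the longest sequence in the list."""
    max_len = max(s.shape[0] for s in seqs)
    out = np.zeros((len(seqs), max_len, hot_dim), dtype=np.float32)
    for i, s in enumerate(seqs):
        # pad at the start so the ends of the sequences stay aligned
        out[i, max_len - s.shape[0]:, :] = s
    return out

# two toy "sequences" of different lengths
a = np.eye(5)[[0, 1, 2]]   # length 3
b = np.eye(5)[[3, 4]]      # length 2
x = pad_one_hot([a, b])
print(x.shape)             # (2, 3, 5)
```

The zero rows added at the front are what "dilutes" the data: the convolutions see them as real input positions.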

This is the model I pass the padded input to:

from keras.models import Sequential
from keras.layers import Conv2D, Activation, Dropout, MaxPooling2D, Flatten, Dense

kernel_length = 50
kernel_length2 = 25
kernel_length3 = 12
conv_layers = 128
hot_dim = 5

model = Sequential()

# input_shape is only needed on the first layer
model.add(Conv2D(conv_layers, (kernel_length, hot_dim), padding='valid',
                 input_shape=(x_train.shape[1], hot_dim, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Conv2D(conv_layers, (kernel_length2, 1), padding='valid'))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Conv2D(conv_layers, (kernel_length3, 1), padding='valid'))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(MaxPooling2D(pool_size=(35, 1), padding='valid'))

model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='linear'))

model.compile(loss='mean_squared_error', optimizer='adam',
              metrics=['accuracy', 'mae', 'mse'])

model.summary()
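This model expects a 4-D input of shape (samples, length, hot_dim, 1), so the padded one-hot array needs a trailing channel axis before it can be passed to model.fit. A minimal sketch with dummy data (the length 900 is inferred from the summary below, since 900 - 50 + 1 = 851):

```python
import numpy as np

# dummy stand-in for the padded one-hot data: (samples, length, hot_dim)
x_padded = np.zeros((10, 900, 5), dtype=np.float32)
x_train = x_padded[..., np.newaxis]  # add the channel axis Conv2D expects
print(x_train.shape)                 # (10, 900, 5, 1)
```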

The model summary returns:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_60 (Conv2D)           (None, 851, 1, 128)       32128     
_________________________________________________________________
activation_58 (Activation)   (None, 851, 1, 128)       0         
_________________________________________________________________
dropout_92 (Dropout)         (None, 851, 1, 128)       0         
_________________________________________________________________
conv2d_61 (Conv2D)           (None, 827, 1, 128)       409728    
_________________________________________________________________
activation_59 (Activation)   (None, 827, 1, 128)       0         
_________________________________________________________________
dropout_93 (Dropout)         (None, 827, 1, 128)       0         
_________________________________________________________________
conv2d_62 (Conv2D)           (None, 816, 1, 128)       196736    
_________________________________________________________________
activation_60 (Activation)   (None, 816, 1, 128)       0         
_________________________________________________________________
dropout_94 (Dropout)         (None, 816, 1, 128)       0         
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 23, 1, 128)        0         
_________________________________________________________________
flatten_16 (Flatten)         (None, 2944)              0         
_________________________________________________________________
dense_69 (Dense)             (None, 128)               376960    
_________________________________________________________________
dropout_95 (Dropout)         (None, 128)               0         
_________________________________________________________________
dense_70 (Dense)             (None, 64)                8256      
_________________________________________________________________
dropout_96 (Dropout)         (None, 64)                0         
_________________________________________________________________
dense_71 (Dense)             (None, 16)                1040      
_________________________________________________________________
dense_72 (Dense)             (None, 1)                 17        
=================================================================
Total params: 1,024,865
Trainable params: 1,024,865
Non-trainable params: 0
_________________________________________________________________

My per-batch model is slightly different, because I have to use GlobalMaxPooling2D:

from keras.models import Sequential
from keras.layers import Conv2D, Activation, Dropout, GlobalMaxPooling2D, Dense

kernel_length = 20
kernel_length2 = 18
kernel_length3 = 15
conv_layers = 64

model = Sequential()

# None in input_shape allows variable-length sequences
model.add(Conv2D(conv_layers, (kernel_length, 4), padding='valid',
                 input_shape=(None, 4, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Conv2D(conv_layers, (kernel_length2, 1), padding='valid'))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(Conv2D(conv_layers, (kernel_length3, 1), padding='valid'))
model.add(Activation('relu'))
model.add(Dropout(0.2))

model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='linear'))

model.compile(loss='mean_squared_error', optimizer='adam',
              metrics=['accuracy', 'mae', 'mse'])

model.summary()
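One constraint worth noting: even with input_shape=(None, 4, 1), each padding='valid' convolution shrinks the length dimension by kernel_length - 1, so there is a minimum sequence length this model can accept:

```python
kernel_lengths = (20, 18, 15)

# each 'valid' convolution removes (k - 1) positions along the length axis,
# so the shortest input that still yields a length-1 feature map is:
min_input_len = sum(k - 1 for k in kernel_lengths) + 1
print(min_input_len)  # 51
```

Any sequence shorter than this cannot make it through the three convolutions.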

The model summary for this one is:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_3 (Conv2D)            (None, None, 1, 64)       5184      
_________________________________________________________________
activation_3 (Activation)    (None, None, 1, 64)       0         
_________________________________________________________________
dropout_5 (Dropout)          (None, None, 1, 64)       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, None, 1, 64)       73792     
_________________________________________________________________
activation_4 (Activation)    (None, None, 1, 64)       0         
_________________________________________________________________
dropout_6 (Dropout)          (None, None, 1, 64)       0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, None, 1, 64)       61504     
_________________________________________________________________
activation_5 (Activation)    (None, None, 1, 64)       0         
_________________________________________________________________
dropout_7 (Dropout)          (None, None, 1, 64)       0         
_________________________________________________________________
global_max_pooling2d_2 (Glob (None, 64)                0         
_________________________________________________________________
dense_5 (Dense)              (None, 128)               8320      
_________________________________________________________________
dropout_8 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_9 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_7 (Dense)              (None, 16)                1040      
_________________________________________________________________
dense_8 (Dense)              (None, 1)                 17        
=================================================================
Total params: 158,113
Trainable params: 158,113
Non-trainable params: 0

I can train this model with:

epochs = 10
for epoch in range(epochs):
    print(epoch)
    for i in range(len(x_train)):
        # each x_train[i] is one variable-length sequence, i.e. a batch of size 1
        model.train_on_batch(x_train[i],
                             np.array(y_train.iloc[i]).reshape((1, 1)))

I don't understand why calling train_on_batch crashes the IPython kernel while model.fit does not. model.fit is called on a model with the same number of sequences, all of them padded, which means it has more parameters to train than the train_on_batch model.

With model.fit I sometimes use batch sizes of up to 200, and training with model.train_on_batch is essentially the same as training with a batch size of 1, isn't it?

Edit: the error I get when running in the interpreter is:

2019-07-02 18:29:40.689406: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9051 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
2019-07-02 18:29:51.843913: F tensorflow/stream_executor/cuda/cuda_dnn.cc:542] Check failed: cudnnSetTensorNdDescriptor(handle_.get(), elem_type, nd, dims.data(), strides.data()) == CUDNN_STATUS_SUCCESS (3 vs. 0)batch_descriptor: {count: 1 feature_map_count: 128 spatial: 0 1  value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
Aborted (core dumped)
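Judging from spatial: 0 in the batch descriptor, one plausible cause is that some sequence in the training list is shorter than the combined receptive field of the three valid convolutions, so a feature map of length zero reaches cuDNN and it aborts the process. A guard along these lines (dummy stand-in data; the threshold of 51 follows from kernel lengths 20, 18, 15) might avoid the crash:

```python
import numpy as np

# hypothetical stand-in for the real training data: variable-length batches of 1
x_train = [np.random.rand(1, n, 4, 1).astype(np.float32) for n in (40, 80, 120)]
y_train = [0.1, 0.2, 0.3]

# each 'valid' conv removes (k - 1) length positions, so the minimum is 51
MIN_LEN = (20 - 1) + (18 - 1) + (15 - 1) + 1

kept = [(x, y) for x, y in zip(x_train, y_train) if x.shape[1] >= MIN_LEN]
print(len(x_train) - len(kept), "sequence(s) too short, skipped")
for x, y in kept:
    pass  # model.train_on_batch(x, np.array(y).reshape((1, 1)))
```

model.fit on the padded data would never hit this, because every padded sequence has the full (fixed) length.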

0 Answers:

There are no answers yet.