I am training a convolutional neural network on a list of DNA sequences, using Keras with the TensorFlow backend, to predict some of their features. I have set up two networks:
1. Pad the beginning of my one-hot encoded sequences so that all sequences have the same length, which means I can pass them through at once with model.fit. Since I cannot mask the input to Conv2D, the padding degrades my training data (a sketch of this preprocessing follows the list).
2. Train the model sequence by sequence, without padding, so the model takes variable input lengths. I hoped this would improve the model's accuracy, but every time I loop over the list of sequences and call model.train_on_batch, my iPython kernel crashes.
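For reference, the padded input for approach 1 is built roughly like this (a minimal sketch, not my exact code; the encode_sequence helper and the five-letter alphabet with an 'N' placeholder are assumptions based on hot_dim=5 below):

import numpy as np

# Hypothetical one-hot encoding; hot_dim=5 suggests A/C/G/T plus one extra symbol.
ALPHABET = {'A': 0, 'C': 1, 'G': 2, 'T': 3, 'N': 4}

def encode_sequence(seq, hot_dim=5):
    # One row per base, one column per symbol.
    one_hot = np.zeros((len(seq), hot_dim), dtype=np.float32)
    for i, base in enumerate(seq):
        one_hot[i, ALPHABET[base]] = 1.0
    return one_hot

def pad_and_stack(seqs, hot_dim=5):
    # Zero-pad the *beginning* of each sequence up to the longest one,
    # then add the trailing channel axis Conv2D expects: (n, max_len, hot_dim, 1).
    max_len = max(len(s) for s in seqs)
    batch = np.zeros((len(seqs), max_len, hot_dim, 1), dtype=np.float32)
    for i, s in enumerate(seqs):
        enc = encode_sequence(s, hot_dim)
        batch[i, max_len - len(enc):, :, 0] = enc
    return batch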
This is the model I pass the padded input to:
import numpy as np
from keras.models import Sequential
from keras.layers import (Activation, Conv2D, Dense, Dropout, Flatten,
                          GlobalMaxPooling2D, MaxPooling2D)

kernel_length = 50
kernel_length2 = 25
kernel_length3 = 12
conv_layers = 128
hot_dim = 5

model = Sequential()
model.add(Conv2D(conv_layers, (kernel_length, hot_dim), padding='valid', input_shape=(x_train.shape[1], hot_dim, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Conv2D(conv_layers, (kernel_length2, 1), padding='valid', input_shape=(x_train.shape[1], hot_dim, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Conv2D(conv_layers, (kernel_length3, 1), padding='valid', input_shape=(x_train.shape[1], hot_dim, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(MaxPooling2D(pool_size=(35,1), padding='valid'))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(64, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(16, activation="relu"))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["accuracy", "mae", "mse"])
model.summary()
The model summary returns:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_60 (Conv2D)           (None, 851, 1, 128)       32128
_________________________________________________________________
activation_58 (Activation)   (None, 851, 1, 128)       0
_________________________________________________________________
dropout_92 (Dropout)         (None, 851, 1, 128)       0
_________________________________________________________________
conv2d_61 (Conv2D)           (None, 827, 1, 128)       409728
_________________________________________________________________
activation_59 (Activation)   (None, 827, 1, 128)       0
_________________________________________________________________
dropout_93 (Dropout)         (None, 827, 1, 128)       0
_________________________________________________________________
conv2d_62 (Conv2D)           (None, 816, 1, 128)       196736
_________________________________________________________________
activation_60 (Activation)   (None, 816, 1, 128)       0
_________________________________________________________________
dropout_94 (Dropout)         (None, 816, 1, 128)       0
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 23, 1, 128)        0
_________________________________________________________________
flatten_16 (Flatten)         (None, 2944)              0
_________________________________________________________________
dense_69 (Dense)             (None, 128)               376960
_________________________________________________________________
dropout_95 (Dropout)         (None, 128)               0
_________________________________________________________________
dense_70 (Dense)             (None, 64)                8256
_________________________________________________________________
dropout_96 (Dropout)         (None, 64)                0
_________________________________________________________________
dense_71 (Dense)             (None, 16)                1040
_________________________________________________________________
dense_72 (Dense)             (None, 1)                 17
=================================================================
Total params: 1,024,865
Trainable params: 1,024,865
Non-trainable params: 0
_________________________________________________________________
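For what it's worth, the shapes and parameter counts above line up with the usual 'valid' convolution arithmetic (working backwards from the summary, the padded length x_train.shape[1] must be 900, which is an inference on my part):

# Each 'valid' Conv2D shrinks the length axis by (kernel_length - 1):
# 900 - 50 + 1 = 851, 851 - 25 + 1 = 827, 827 - 12 + 1 = 816.
# MaxPooling2D(pool_size=(35, 1)) then gives 816 // 35 = 23, so Flatten
# yields 23 * 128 = 2944 features.
# Conv2D params = kernel_h * kernel_w * in_channels * filters + filters:
assert 50 * 5 * 1 * 128 + 128 == 32128      # conv2d_60
assert 25 * 1 * 128 * 128 + 128 == 409728   # conv2d_61
assert 12 * 1 * 128 * 128 + 128 == 196736   # conv2d_62
assert 2944 * 128 + 128 == 376960           # dense_69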
My per-batch model is slightly different, because I have to use GlobalMaxPooling2D:
kernel_length = 20
kernel_length2 = 18
kernel_length3 = 15
conv_layers = 64

model = Sequential()
model.add(Conv2D(conv_layers, (kernel_length, 4), padding='valid', input_shape=(None, 4, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Conv2D(conv_layers, (kernel_length2, 1), padding='valid', input_shape=(None, 4, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Conv2D(conv_layers, (kernel_length3, 1), padding='valid', input_shape=(None, 4, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(64, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(16, activation="relu"))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=["accuracy", "mae", "mse"])
model.summary()
The model summary for this one is:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_3 (Conv2D)            (None, None, 1, 64)       5184
_________________________________________________________________
activation_3 (Activation)    (None, None, 1, 64)       0
_________________________________________________________________
dropout_5 (Dropout)          (None, None, 1, 64)       0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, None, 1, 64)       73792
_________________________________________________________________
activation_4 (Activation)    (None, None, 1, 64)       0
_________________________________________________________________
dropout_6 (Dropout)          (None, None, 1, 64)       0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, None, 1, 64)       61504
_________________________________________________________________
activation_5 (Activation)    (None, None, 1, 64)       0
_________________________________________________________________
dropout_7 (Dropout)          (None, None, 1, 64)       0
_________________________________________________________________
global_max_pooling2d_2 (Glob (None, 64)                0
_________________________________________________________________
dense_5 (Dense)              (None, 128)               8320
_________________________________________________________________
dropout_8 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_6 (Dense)              (None, 64)                8256
_________________________________________________________________
dropout_9 (Dropout)          (None, 64)                0
_________________________________________________________________
dense_7 (Dense)              (None, 16)                1040
_________________________________________________________________
dense_8 (Dense)              (None, 1)                 17
=================================================================
Total params: 158,113
Trainable params: 158,113
Non-trainable params: 0
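GlobalMaxPooling2D is what makes the variable-length input work here: it collapses the length axis to one value per filter, so the dense layers always see a fixed 64-wide vector regardless of sequence length. A quick sanity check (the random inputs are just placeholders):

import numpy as np

# Two dummy sequences of different lengths, each as a batch of one: (1, length, 4, 1).
short = np.random.rand(1, 100, 4, 1).astype(np.float32)
long_ = np.random.rand(1, 500, 4, 1).astype(np.float32)

print(model.predict(short).shape)  # (1, 1)
print(model.predict(long_).shape)  # (1, 1)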
I can train this model with:
epochs = 10
for y in range(epochs):
    print(y)
    for x in range(len(x_train)):
        model.train_on_batch(x_train[x],
                             np.array(y_train.iloc[x]).reshape((1, 1)),
                             sample_weight=None, class_weight=None)
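For this loop to run, each x_train[x] already has to be a four-dimensional batch of one, i.e. shape (1, length, 4, 1); something along these lines (the sequences list and the encode_sequence helper from the sketch above are assumptions, not my exact preprocessing):

# Hypothetical preprocessing: each sequence becomes its own batch of one.
x_train = [encode_sequence(s, hot_dim=4).reshape(1, -1, 4, 1) for s in sequences]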
What I don't understand is why train_on_batch crashes the iPython kernel while model.fit does not. model.fit is called on a model with the same number of sequences, all of them padded, which means it has more parameters to train than the train_on_batch model.

With model.fit I sometimes use batch sizes of up to 200, while training with model.train_on_batch the way I do is essentially the same as using a batch size of 1, isn't it?
Edit: The error that comes up when running in the interpreter is:
2019-07-02 18:29:40.689406: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9051 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
2019-07-02 18:29:51.843913: F tensorflow/stream_executor/cuda/cuda_dnn.cc:542] Check failed: cudnnSetTensorNdDescriptor(handle_.get(), elem_type, nd, dims.data(), strides.data()) == CUDNN_STATUS_SUCCESS (3 vs. 0)batch_descriptor: {count: 1 feature_map_count: 128 spatial: 0 1 value_min: 0.000000 value_max: 0.000000 layout: BatchDepthYX}
Aborted (core dumped)