结合LSTM和CNN的问题? (Python,Keras)

时间:2018-01-08 06:19:20

标签: python keras conv-neural-network lstm

我想预测8个字符的车牌,所以我在Keras写了下面的模型:

x = Input(shape=(HEIGHT, WIDTH, CHANNELS))
base_model = InceptionV3(include_top=False, weights='imagenet', input_shape=(HEIGHT, WIDTH, CHANNELS))
base_model.trainable = False
y = base_model(x)
y = Reshape((8, 9 * 256))(y)
y = LSTM(units=20, return_sequences='true')(y)
y = Dropout(0.5)(y)
y = TimeDistributed(Dense(TOTAL_CHARS, activation="softmax", activity_regularizer=regularizers.l2(REGUL_PARAM)))(y)
y = Dropout(0.25)(y)
model = Model(input=x, output=y)
model.compile(loss="categorical_crossentropy", optimizer='rmsprop', metrics=['accuracy'])

我有大约6000个用于训练的数据,我用ImageGenerator扩充它们。我的问题是损失和准确度在时间上大致保持不变:

************************************************************
Epoch: 1
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4525 - acc: 0.1924Epoch 00001: val_loss improved from 2.17175 to 2.15020, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4535 - acc: 0.1924 - val_loss: 2.1502 - val_acc: 0.2232
************************************************************
Epoch: 2
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6848/6869 [============================>.] - ETA: 0s - loss: 5.4543 - acc: 0.1959Epoch 00001: val_loss improved from 2.15020 to 2.11809, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 26s 4ms/step - loss: 5.4537 - acc: 0.1958 - val_loss: 2.1181 - val_acc: 0.2281
************************************************************
Epoch: 3
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4284 - acc: 0.1977Epoch 00001: val_loss improved from 2.11809 to 2.09679, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4282 - acc: 0.1978 - val_loss: 2.0968 - val_acc: 0.2304
************************************************************
Epoch: 4
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4500 - acc: 0.2004Epoch 00001: val_loss did not improve
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4490 - acc: 0.2004 - val_loss: 2.1146 - val_acc: 0.2355
************************************************************
Epoch: 5
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6848/6869 [============================>.] - ETA: 0s - loss: 5.4399 - acc: 0.2006Epoch 00001: val_loss did not improve
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4374 - acc: 0.2009 - val_loss: 2.1102 - val_acc: 0.2324
************************************************************
Epoch: 6
************************************************************
Train on 6869 samples, validate on 1718 samples
Epoch 1/1
6856/6869 [============================>.] - ETA: 0s - loss: 5.4636 - acc: 0.1977Epoch 00001: val_loss improved from 2.09679 to 2.09076, saving model to ./trained_model_V10.hdf5
6869/6869 [==============================] - 25s 4ms/step - loss: 5.4629 - acc: 0.1978 - val_loss: 2.0908 - val_acc: 0.2341
************************************************************

现在,我不确定我模型的正确性,我认为问题是我的模型。这是组合CNN和LSTM的正确方法吗?

我也尝试过以下模式:

REGUL_PARAM = 0
image = Input(shape=(HEIGHT, WIDTH, CHANNELS))
x = Reshape((8, HEIGHT, int(WIDTH/8), CHANNELS))(image)
y = TimeDistributed(Conv2D(16, (3, 3), activation='relu', padding='same', activity_regularizer=regularizers.l2(REGUL_PARAM)))(x)
y = TimeDistributed(MaxPooling2D((2, 2)))(y)
y = TimeDistributed(Conv2D(32, (3, 3), activation='relu', padding='same', activity_regularizer=regularizers.l2(REGUL_PARAM)))(y)
y = TimeDistributed(MaxPooling2D((2, 2)))(y)
y = TimeDistributed(Conv2D(64, (3, 3), activation='relu', padding='same', activity_regularizer=regularizers.l2(REGUL_PARAM)))(y)
y = Reshape((int(y.shape[1]), int(y.shape[4]*y.shape[3]*y.shape[2])))(y)
y = Bidirectional(LSTM(units=50, return_sequences='true'))(y)
y = TimeDistributed(Dense(64, activity_regularizer=regularizers.l2(REGUL_PARAM), activation='relu'))(y)
y = Dropout(0.25)(y)
y = TimeDistributed(Dense(TOTAL_CHARS, activity_regularizer=regularizers.l2(REGUL_PARAM), activation='softmax'))(y)
y = Dropout(0.25)(y)

model = Model(inputs=image, outputs=y)

这个准确度大约是70%,但关键是即使我的数据中有一小部分也不能过度使用。

1 个答案:

答案 0 :(得分:1)

显然,您的模型效果不佳。

您可以查看此code

'''Train a recurrent convolutional network on the IMDB sentiment
classification task.
Gets to 0.8498 test accuracy after 2 epochs. 41s/epoch on K520 GPU.
'''
from __future__ import print_function

from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
from keras.layers import Conv1D, MaxPooling1D
from keras.datasets import imdb

# Embedding
max_features = 20000
maxlen = 100
embedding_size = 128

# Convolution
kernel_size = 5
filters = 64
pool_size = 4

# LSTM
lstm_output_size = 70

# Training
batch_size = 30
epochs = 2

'''
Note:
batch_size is highly sensitive.
Only 2 epochs are needed as the dataset is very small.
'''

print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

print('Build model...')

model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Conv1D(filters,
                 kernel_size,
                 padding='valid',
                 activation='relu',
                 strides=1))
model.add(MaxPooling1D(pool_size=pool_size))
model.add(LSTM(lstm_output_size))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print('Train...')
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))
score, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)