I am building a program that recognizes Arabic characters with a CNN in Keras. I have tried the model with several different architectures, including the one proposed by the dataset's creators. The problem is that predictions on the dataset's own test_data are good, but whenever I predict on a real input image, or on an image generated from a canvas (through a web app I made), the prediction is wrong, no matter how many images I try.
I have saved and reloaded the model, which trains with good accuracy and low loss. I load the image with the OpenCV library, resize it to fit the model, convert it to grayscale, turn it into an array, and feed it to the predict function, but the output is wrong. By contrast, when I load the test_data together with its labels and feed it to the model, I get correct results.
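For reference, the image preprocessing I described looks roughly like this (a minimal sketch using OpenCV; in the snippet further down I ended up using scipy.misc, but the idea is the same):

import cv2

# Read the uploaded/canvas image as a single grayscale channel.
img = cv2.imread('output.png', cv2.IMREAD_GRAYSCALE)
# Resize to the 64x64 input size the model expects.
img = cv2.resize(img, (64, 64))
# Add batch and channel dimensions -> shape (1, 64, 64, 1).
x = img.astype('float32').reshape((-1, 64, 64, 1))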
So here is my code, from loading the dataset, through training and the (correct) test_data results, to the (wrong) results on input images.
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, GlobalAveragePooling2D, Dense
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint

# Training letters images and labels files
letters_training_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/training images.zip"
letters_training_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/training labels.zip"
# Testing letters images and labels files
letters_testing_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/testing images.zip"
letters_testing_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Characters Dataset CSV/testing labels.zip"
# Loading dataset into dataframes
training_letters_images = pd.read_csv(letters_training_images_file_path, compression='zip', header=None)
training_letters_labels = pd.read_csv(letters_training_labels_file_path, compression='zip', header=None)
testing_letters_images = pd.read_csv(letters_testing_images_file_path, compression='zip', header=None)
testing_letters_labels = pd.read_csv(letters_testing_labels_file_path, compression='zip', header=None)
# Training digits images and labels files
digits_training_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/training images.zip"
digits_training_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/training labels.zip"
# Testing digits images and labels files
digits_testing_images_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/testing images.zip"
digits_testing_labels_file_path = "drive/My Drive/ARlearning/Arabic Handwritten Digits Dataset CSV/testing labels.zip"
# Loading dataset into dataframes
training_digits_images = pd.read_csv(digits_training_images_file_path, compression='zip', header=None)
training_digits_labels = pd.read_csv(digits_training_labels_file_path, compression='zip', header=None)
testing_digits_images = pd.read_csv(digits_testing_images_file_path, compression='zip', header=None)
testing_digits_labels = pd.read_csv(digits_testing_labels_file_path, compression='zip', header=None)
training_digits_images_scaled = training_digits_images.values.astype('float32')/255
training_digits_labels = training_digits_labels.values.astype('int32')
testing_digits_images_scaled = testing_digits_images.values.astype('float32')/255
testing_digits_labels = testing_digits_labels.values.astype('int32')
training_letters_images_scaled = training_letters_images.values.astype('float32')/255
training_letters_labels = training_letters_labels.values.astype('int32')
testing_letters_images_scaled = testing_letters_images.values.astype('float32')/255
testing_letters_labels = testing_letters_labels.values.astype('int32')
print("Training images of digits after scaling")
print(training_digits_images_scaled.shape)
training_digits_images_scaled[0:5]
print("Training images of letters after scaling")
print(training_letters_images_scaled.shape)
training_letters_images_scaled[0:5]
# one hot encoding
# number of classes = 10 (digits classes) + 28 (arabic alphabet classes)
number_of_classes = 38
training_letters_labels_encoded = to_categorical(training_letters_labels, num_classes=number_of_classes)
testing_letters_labels_encoded = to_categorical(testing_letters_labels, num_classes=number_of_classes)
training_digits_labels_encoded = to_categorical(training_digits_labels, num_classes=number_of_classes)
testing_digits_labels_encoded = to_categorical(testing_digits_labels, num_classes=number_of_classes)
# reshape input digit images to 64x64x1
training_digits_images_scaled = training_digits_images_scaled.reshape([-1, 64, 64, 1])
testing_digits_images_scaled = testing_digits_images_scaled.reshape([-1, 64, 64, 1])
# reshape input letter images to 64x64x1
training_letters_images_scaled = training_letters_images_scaled.reshape([-1, 64, 64, 1])
testing_letters_images_scaled = testing_letters_images_scaled.reshape([-1, 64, 64, 1])
print(training_digits_images_scaled.shape, training_digits_labels_encoded.shape, testing_digits_images_scaled.shape, testing_digits_labels_encoded.shape)
print(training_letters_images_scaled.shape, training_letters_labels_encoded.shape, testing_letters_images_scaled.shape, testing_letters_labels_encoded.shape)
training_data_images = np.concatenate((training_digits_images_scaled, training_letters_images_scaled), axis=0)
training_data_labels = np.concatenate((training_digits_labels_encoded, training_letters_labels_encoded), axis=0)
print("Total Training images are {} images of shape".format(training_data_images.shape[0]))
print(training_data_images.shape, training_data_labels.shape)
testing_data_images = np.concatenate((testing_digits_images_scaled, testing_letters_images_scaled), axis=0)
testing_data_labels = np.concatenate((testing_digits_labels_encoded, testing_letters_labels_encoded), axis=0)
print("Total Testing images are {} images of shape".format(testing_data_images.shape[0]))
print(testing_data_images.shape, testing_data_labels.shape)
def create_model(optimizer='adam', kernel_initializer='he_normal', activation='relu'):
    # create model
    model = Sequential()
    model.add(Conv2D(filters=16, kernel_size=3, padding='same', input_shape=(64, 64, 1), kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(0.2))
    model.add(Conv2D(filters=32, kernel_size=3, padding='same', kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(0.2))
    model.add(Conv2D(filters=64, kernel_size=3, padding='same', kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(0.2))
    model.add(Conv2D(filters=128, kernel_size=3, padding='same', kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(0.2))
    model.add(GlobalAveragePooling2D())
    # Fully connected final layer
    model.add(Dense(38, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=optimizer)
    return model
model = create_model()
model.summary()
model = create_model(optimizer='Adam', kernel_initializer='normal', activation='relu')
epochs = 20
batch_size = 20
checkpointer = ModelCheckpoint(filepath='weights.hdf5', verbose=1, save_best_only=True)
history = model.fit(training_data_images, training_data_labels,
                    validation_data=(testing_data_images, testing_data_labels),
                    epochs=epochs, batch_size=batch_size, verbose=1, callbacks=[checkpointer])
Training results:
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Train on 73440 samples, validate on 13360 samples
Epoch 1/10
73440/73440 [==============================] - 52s 702us/step - loss: 0.3535 - acc: 0.9062 - val_loss: 0.2023 - val_acc: 0.9236
Epoch 00001: val_loss improved from inf to 0.20232, saving model to weights.hdf5
Epoch 2/10
73440/73440 [==============================] - 48s 658us/step - loss: 0.1068 - acc: 0.9672 - val_loss: 0.1701 - val_acc: 0.9469
Epoch 00002: val_loss improved from 0.20232 to 0.17013, saving model to weights.hdf5
Epoch 3/10
73440/73440 [==============================] - 49s 667us/step - loss: 0.0799 - acc: 0.9753 - val_loss: 0.1112 - val_acc: 0.9707
Epoch 00003: val_loss improved from 0.17013 to 0.11123, saving model to weights.hdf5
Epoch 4/10
73440/73440 [==============================] - 47s 638us/step - loss: 0.0684 - acc: 0.9786 - val_loss: 0.0715 - val_acc: 0.9800
Epoch 00004: val_loss improved from 0.11123 to 0.07150, saving model to weights.hdf5
Epoch 5/10
73440/73440 [==============================] - 48s 660us/step - loss: 0.0601 - acc: 0.9812 - val_loss: 0.2134 - val_acc: 0.9343
Epoch 00005: val_loss did not improve from 0.07150
Epoch 6/10
73440/73440 [==============================] - 47s 647us/step - loss: 0.0545 - acc: 0.9828 - val_loss: 0.0641 - val_acc: 0.9814
Epoch 00006: val_loss improved from 0.07150 to 0.06413, saving model to weights.hdf5
Epoch 7/10
73440/73440 [==============================] - 48s 655us/step - loss: 0.0490 - acc: 0.9846 - val_loss: 0.8639 - val_acc: 0.7332
Epoch 00007: val_loss did not improve from 0.06413
Epoch 8/10
73440/73440 [==============================] - 48s 660us/step - loss: 0.0472 - acc: 0.9854 - val_loss: 0.0509 - val_acc: 0.9844
Epoch 00008: val_loss improved from 0.06413 to 0.05093, saving model to weights.hdf5
Epoch 9/10
73440/73440 [==============================] - 47s 644us/step - loss: 0.0433 - acc: 0.9859 - val_loss: 0.0713 - val_acc: 0.9791
Epoch 00009: val_loss did not improve from 0.05093
Epoch 10/10
73440/73440 [==============================] - 49s 665us/step - loss: 0.0434 - acc: 0.9861 - val_loss: 0.2861 - val_acc: 0.9012
Epoch 00010: val_loss did not improve from 0.05093
And after evaluating the model on the test_data:
Test accuracy: 0.9843562874251497
Test loss: 0.05093173268935584
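(The two numbers above come from evaluating on the combined digits + letters test set, roughly like this:)

test_loss, test_acc = model.evaluate(testing_data_images, testing_data_labels, verbose=0)
print("Test accuracy:", test_acc)
print("Test loss:", test_loss)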
Now, predicting classes from the test_data:
def get_predicted_classes(model, data, labels=None):
    image_predictions = model.predict(data)
    predicted_classes = np.argmax(image_predictions, axis=1)
    true_classes = np.argmax(labels, axis=1)
    return predicted_classes, true_classes

from sklearn.metrics import classification_report

def get_classification_report(y_true, y_pred):
    print(classification_report(y_true, y_pred))

y_pred, y_true = get_predicted_classes(model, testing_data_images, testing_data_labels)
get_classification_report(y_true, y_pred)
precision recall f1-score support
0 0.98 0.99 0.99 1000
1 0.99 0.99 0.99 1000
2 0.98 1.00 0.99 1000
3 1.00 0.99 0.99 1000
4 1.00 0.99 0.99 1000
5 0.99 0.98 0.99 1000
6 0.99 0.99 0.99 1000
7 1.00 0.99 1.00 1000
8 1.00 0.99 1.00 1000
9 1.00 0.99 0.99 1000
10 0.99 1.00 1.00 120
11 1.00 0.97 0.99 120
12 0.87 0.97 0.91 120
13 1.00 0.89 0.94 120
14 0.98 0.99 0.98 120
15 0.96 0.98 0.97 120
16 0.99 0.97 0.98 120
17 0.91 0.99 0.95 120
18 0.94 0.91 0.92 120
19 0.94 0.93 0.93 120
20 0.96 0.90 0.93 120
21 0.99 0.93 0.96 120
22 0.99 1.00 1.00 120
23 0.91 0.99 0.95 120
24 0.99 0.96 0.97 120
25 0.96 0.96 0.96 120
26 0.95 0.96 0.95 120
27 0.99 0.97 0.98 120
28 0.99 0.99 0.99 120
29 0.95 0.84 0.89 120
30 0.84 0.97 0.90 120
31 0.98 0.98 0.98 120
32 0.98 1.00 0.99 120
33 0.99 1.00 1.00 120
34 0.96 0.90 0.93 120
35 0.99 0.96 0.97 120
36 0.95 0.97 0.96 120
37 0.98 0.99 0.99 120
micro avg 0.98 0.98 0.98 13360
macro avg 0.97 0.97 0.97 13360
weighted avg 0.98 0.98 0.98 13360
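Before the web-app prediction below, the model is saved after training and reloaded in the web app, roughly like this (a sketch; the file name is a placeholder, and modelAR/graphAR are the handles used in the next snippet):

import tensorflow as tf
from keras.models import load_model

# Save the full model (architecture + weights) once training is done.
model.save('arabic_model.h5')

# In the web app: load the model once at startup and keep the TensorFlow graph,
# so predict() calls from request threads run against the same graph.
modelAR = load_model('arabic_model.h5')
graphAR = tf.get_default_graph()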
And the prediction on an input image:
from scipy.misc import imread, imresize

# Load the image as grayscale, invert it and resize to the model's input size.
x = imread('output.png', mode='L')
x = np.invert(x)
x = imresize(x, (64, 64))
#x = x/255
x = x.reshape((-1, 64, 64, 1))
with graphAR.as_default():
    out = modelAR.predict(x)
#print(out)
print(np.argmax(out, axis=1))
response = np.array_str(np.argmax(out, axis=1))
print(response)
But the result is always wrong.
For example, here is what I expect for one input image:
Expected prediction: alif-أ
Actual result: [[0]] = sifr-0
Some of the input images I tried:
Answer (score: 0):
In the training phase, you apply these preprocessing steps before training: scaling the images and converting the labels to integers.
training_digits_images_scaled = training_digits_images.values.astype('float32')/255
training_digits_labels = training_digits_labels.values.astype('int32')
At prediction time you must apply the exact same preprocessing. For the input image prediction:
#Convert to grayscale only if training images are in grayscale too.
#It's generally a good idea to train and predict with grayscaled images.
x = imread('output.png', mode='L')
# Not sure why you are doing this
#x = np.invert(x)
x = x.astype('float32')/255
# Note: in training only the labels were cast to int32; the image pixels stay float32.
x = x.reshape((-1,64,64,1))
## Continue with prediction function
This should work. Let me know how it goes.
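Building on that, here is a single helper that mirrors the training preprocessing end to end (a sketch, assuming OpenCV is available; it also restores the resize to 64x64 that the model needs, and only inverts if your canvas produces the opposite background/foreground polarity from the training images):

import cv2
import numpy as np

def preprocess_for_model(image_path, invert=False):
    """Load an image and preprocess it the same way as the training data:
    grayscale, 64x64, float32 scaled to [0, 1], shaped (1, 64, 64, 1)."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (64, 64))
    if invert:
        # Only needed if the drawing is dark-on-light while the training
        # images are light-on-dark (or vice versa).
        img = 255 - img
    img = img.astype('float32') / 255.0
    return img.reshape((1, 64, 64, 1))

# Usage with the handles from the question:
# x = preprocess_for_model('output.png')
# with graphAR.as_default():
#     out = modelAR.predict(x)
# print(np.argmax(out, axis=1))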