Question

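This Python code uses TensorFlow and Keras to classify car images: 0 = not a car, 1 = car. I'm a bit confused by the results. My dataset contains 1513 jpg images, but the results don't seem to show 1513 readings. The results also look incorrect: for example, the output begins "0 0 0 0 1 0 0 0 0", whereas in reality the first 10 images should all be "1", because the first 10 images are all cars.

Is there anything I can do to make the results clearer? Kind regards.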

from keras.models import Sequential # Initialise our neural network model as a sequential network
from keras.layers import Conv2D # Convolution operation
from keras.layers import MaxPooling2D # Maxpooling function
from keras.layers import Flatten # Converting 2D arrays into a 1D linear vector.
from keras.layers import Dense # Perform the full connection of the neural network
from keras.preprocessing.image import ImageDataGenerator
from IPython.display import display
from PIL import Image
import os
import cv2
import numpy as np
from sklearn.metrics import accuracy_score
from skimage import io, transform

def cnn_classifier():
    cnn = Sequential()
    cnn.add(Conv2D(8, (3,3), input_shape = (50, 50, 3), padding='same', activation = 'relu'))
    cnn.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    cnn.add(Conv2D(16, (3,3), padding='same', activation = 'relu'))
    cnn.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    cnn.add(Flatten())
    cnn.add(Dense(128, activation = 'relu'))
    cnn.add(Dense(2, activation = 'softmax'))
    cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    print(cnn.summary())
    return cnn

def reshaped_image(image):
    return transform.resize(image,(50,50,3)) # output_shape is (rows, cols, channels); np.resize() would repeat or truncate data instead of interpolating

def load_images_from_folder():
    Images = os.listdir("./Dataset/")
    train_images = []
    train_labels = []
    for image in Images:
        if image[-4:] == 'jpeg':
            path = os.path.join("./Dataset/", image)
            img = cv2.imread(path)
            train_images.append(reshaped_image(img))
            label_file = image[:-5] + '.txt'
            with open("./Dataset"+"/"+label_file) as f:
                content = f.readlines()
                label = int(float(content[0]))
                l = [0, 0]
                l[label] = 1 # 1=car and 0=not car
                train_labels.append(l)
    return np.array(train_images), np.array(train_labels)

def train_test_split(train_data, train_labels, fraction):
    index = int(len(train_data)*fraction)
    return train_data[:index], train_labels[:index], train_data[index:], train_labels[index:]

train_data, train_labels = load_images_from_folder()
fraction = 0.8
train_data, train_labels, test_data, test_labels = train_test_split(train_data, train_labels, fraction)
print ("Train data size: ", len(train_data))
print ("Test data size: ", len(test_data))

cnn = cnn_classifier()

print ("Train data shape: ", train_data.shape)
print ("Test data shape: ", train_labels.shape) # despite the text, this prints the shape of the training labels

idx = np.random.permutation(train_data.shape[0])
cnn.fit(train_data[idx], train_labels[idx], epochs = 10)
predicted_test_labels = np.argmax(cnn.predict(test_data), axis=1)
test_labels = np.argmax(test_labels, axis=1)

print ("Actual test labels:", test_labels)
print ("Predicted test labels:", predicted_test_labels)

print ("Accuracy score:", accuracy_score(test_labels, predicted_test_labels))

Results

Train data size:  1210
Test data size:  303
Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_9 (Conv2D)            (None, 50, 50, 8)         224       
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 25, 25, 8)         0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 25, 25, 16)        1168      
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 13, 13, 16)        0         
_________________________________________________________________
flatten_5 (Flatten)          (None, 2704)              0         
_________________________________________________________________
dense_9 (Dense)              (None, 128)               346240    
_________________________________________________________________
dense_10 (Dense)             (None, 2)                 258       
=================================================================
Total params: 347,890
Trainable params: 347,890
Non-trainable params: 0
_________________________________________________________________
None
Train data shape:  (1210, 50, 50, 3)
Test data shape:  (1210, 2)
Epoch 1/10
1210/1210 [==============================] - 1s 433us/step - loss: 0.4682 - accuracy: 0.8331
Epoch 2/10
1210/1210 [==============================] - 0s 300us/step - loss: 0.2686 - accuracy: 0.9066
Epoch 3/10
1210/1210 [==============================] - 0s 320us/step - loss: 0.1746 - accuracy: 0.9421
Epoch 4/10
1210/1210 [==============================] - 0s 302us/step - loss: 0.1177 - accuracy: 0.9595
Epoch 5/10
1210/1210 [==============================] - 0s 311us/step - loss: 0.1105 - accuracy: 0.9620
Epoch 6/10
1210/1210 [==============================] - 0s 298us/step - loss: 0.1019 - accuracy: 0.9645
Epoch 7/10
1210/1210 [==============================] - 0s 302us/step - loss: 0.0695 - accuracy: 0.9752
Epoch 8/10
1210/1210 [==============================] - 0s 309us/step - loss: 0.0672 - accuracy: 0.9777
Epoch 9/10
1210/1210 [==============================] - 0s 295us/step - loss: 0.0503 - accuracy: 0.9826
Epoch 10/10
1210/1210 [==============================] - 0s 304us/step - loss: 0.0348 - accuracy: 0.9893
Actual test labels: [0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0
 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0
 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0
 0 0 0 0 0 0 0]
Predicted test labels: [0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 0
 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0
 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0
 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0
 0 0 0 0 0 0 0]
Accuracy score: 0.9405940594059405

Answer 1

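In your case, you are doing binary classification with one-hot encoded labels. Compile the model with categorical_crossentropy instead of binary_crossentropy, because your labels are in one-hot form.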
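As a rough illustration of why the loss should match the label format, here is a NumPy sketch of what categorical cross-entropy computes for one-hot labels. The function and the sample probabilities are made up for the example, and this is not Keras's own implementation; in the question's code the corresponding change would be passing loss = 'categorical_crossentropy' to cnn.compile.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean categorical cross-entropy for one-hot labels (illustrative only)."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    # Only the probability assigned to the true class contributes per sample.
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(np.mean(per_sample))

# One-hot labels in the question's format: [1, 0] = not car, [0, 1] = car.
y_true = np.array([[0, 1], [1, 0]])
# Hypothetical softmax outputs; each row sums to 1.
y_pred = np.array([[0.1, 0.9], [0.8, 0.2]])

loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

The loss shrinks as the probability assigned to the correct class approaches 1, which is the behaviour you want when each label row marks exactly one class.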

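For a better model, I suggest changing the last layer to a single neuron and using sigmoid instead of softmax for binary classification; you can then use binary_crossentropy as the loss function.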
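With a single sigmoid output, the labels and the prediction step change shape as well. The sketch below uses only NumPy and made-up values. It assumes the final layer is Dense(1, activation = 'sigmoid') as suggested above, so the one-hot labels built in load_images_from_folder become a single 0/1 value per image, and predictions are thresholded at 0.5 rather than passed through np.argmax.

```python
import numpy as np

# One-hot labels as built in the question: [1, 0] = not car, [0, 1] = car.
one_hot_labels = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])

# A single sigmoid neuron expects one 0/1 value per image.
binary_labels = np.argmax(one_hot_labels, axis=1)

# Hypothetical output of cnn.predict(test_data) for a Dense(1, sigmoid) head:
# shape (n_samples, 1), each value is the estimated probability of "car".
probabilities = np.array([[0.12], [0.93], [0.40], [0.07]])

# Threshold at 0.5; np.argmax over a single column would always return 0.
predicted_labels = (probabilities.ravel() > 0.5).astype(int)

print("Binary labels:   ", binary_labels)
print("Predicted labels:", predicted_labels)
```

The 0.5 threshold is a common default, not a requirement; it can be moved if false positives and false negatives have different costs.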

You can read this for more details.
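On making the output clearer: the log in the question shows 303 test images (the 20% left after the 0.8 split), so only 303 predictions are printed rather than 1513. Also, os.listdir returns names in arbitrary order, so the printed labels need not follow the order in which the files appear in the folder. One possible approach, sketched here with made-up file names and placeholder predictions, is to carry the file names through a shuffled split and print them next to each label. The helper name shuffled_split is hypothetical.

```python
import numpy as np

def shuffled_split(names, data, labels, fraction, seed=0):
    """Shuffle names, data and labels together, then split at `fraction`."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(names))
    names, data, labels = names[idx], data[idx], labels[idx]
    cut = int(len(names) * fraction)
    train = (names[:cut], data[:cut], labels[:cut])
    test = (names[cut:], data[cut:], labels[cut:])
    return train, test

# Stand-in data: 10 blank "images" with made-up file names and labels.
names = np.array([f"img_{i}.jpeg" for i in range(10)])
data = np.zeros((10, 50, 50, 3))
labels = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0])

train, test = shuffled_split(names, data, labels, fraction=0.8)
test_names, test_data, test_labels = test

# Placeholder for np.argmax(cnn.predict(test_data), axis=1).
predicted = test_labels.copy()

for name, actual, pred in zip(test_names, test_labels, predicted):
    print(f"{name}: actual={actual} predicted={pred}")
```

Shuffling before the split also keeps both classes represented in the test set when the files happen to be grouped by class.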
