我正在使用Keras和Tensorflow训练一个模型,该模型根据一些字母的图像预测匹配的字体。我的文件夹包含一个单独的文件夹中的数据,每个字母的图像都以不同的形式出现。我用于训练模型的代码如下:
LETTER_IMAGES_FOLDER = "datasets"
MODEL_FILENAME = "fonts_model.hdf5"
MODEL_LABELS_FILENAME = "model_labels.dat"
data = pd.read_csv('annotations.csv')
paths = list(data['Path'].values)
Y = list(data['Font'].values)
encoder = LabelEncoder()
encoder.fit(Y)
Y = encoder.transform(Y)
Y = np_utils.to_categorical(Y)
data = []
# loop over the input images
for image_file in paths:
# Load the image and convert it to grayscale
image = cv2.imread(image_file)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Add a third channel dimension to the image to make Keras happy
image = np.expand_dims(image, axis=2)
# Add the letter image and it's label to our training data
data.append(image)
data = np.array(data, dtype="float") / 255.0
train_x, test_x, train_y, test_y = model_selection.train_test_split(data,Y,test_size = 0.1, random_state = 0)
# Save the mapping from labels to one-hot encodings.
# We'll need this later when we use the model to decode what it's predictions mean
with open(MODEL_LABELS_FILENAME, "wb") as f:
pickle.dump(encoder, f)
# Build the neural network!
model = Sequential()
# First convolutional layer with max pooling
model.add(Conv2D(20, (5, 5), padding="same", input_shape=(100, 100, 1), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
# Second convolutional layer with max pooling
model.add(Conv2D(50, (5, 5), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Flatten())
model.add(Dense(500, activation="relu"))
print (len(encoder.classes_))
model.add(Dense(len(encoder.classes_), activation="softmax"))
# Ask Keras to build the TensorFlow model behind the scenes
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
# Train the neural network
model.fit(train_x, train_y, validation_data=(test_x, test_y), batch_size=32, epochs=2, verbose=1)
# Save the trained model to disk
model.save(MODEL_FILENAME)
一旦创建了模型,我将对其进行如下预测:
predictions = model.predict(letter_image)
print (predictions) # this has the length of 1
问题是“预测”始终是大小为1的数组,我不确定为什么。我使用的是softmax,categorical_crossentropy,并且我的Dense值在最后一层大于1。有人可以告诉我为什么我在这里没有得到前n个预测吗?
我也尝试过使用Sigmoid进行binary_crossentropy,但是得到相同的结果。我认为还有更多我想念的东西。