I'm trying to use Keras and its MobileNet implementation to do object localization (outputting the x/y coordinates of a few features, rather than classes), and I've run into what is probably a very basic problem that I just can't figure out.
My code looks like this:
from keras import applications, optimizers
from keras import backend as K
from keras.layers import Conv2D, Dense, Dropout, GlobalAveragePooling2D, Reshape
from keras.models import Model

PREDICT_SIZE = 4  # two pairs of x/y coordinates

# =============================
# Load MobileNet and change the top layers.
model = applications.MobileNet(weights="imagenet",
                               include_top=False,
                               input_shape=(224, 224, 3))

# Freeze all the layers except the very last 5.
for layer in model.layers[:-5]:
    layer.trainable = False

# Adding custom Layers at the end, after the last Conv2D layer.
x = model.output
x = GlobalAveragePooling2D()(x)
x = Reshape((1, 1, 1024))(x)
x = Dropout(0.5)(x)
x = Conv2D(1024, (1, 1), activation='relu', padding='same', name='conv_preds')(x)
x = Dense(1024, activation="relu")(x)
# I'd like this to output 4 variables, two pairs of x/y coordinates.
x = Dense(PREDICT_SIZE, activation="sigmoid")(x)
predictions = Reshape((PREDICT_SIZE,))(x)

# =============================
# Create the new final model.
model_final = Model(inputs=model.input, outputs=predictions)

def custom_loss(y_true, y_pred):
    '''Trying to compute the Euclidean distance as a loss function.'''
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1))

model_final.compile(loss=custom_loss,
                    optimizer=optimizers.Adam(lr=0.0001),
                    metrics=["accuracy"])
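(As an aside on the loss itself: K.sqrt has an infinite gradient at zero, so Euclidean-distance losses are sometimes written with a small epsilon under the square root. A minimal defensive sketch of that variant, not necessarily related to the nan below:)

def custom_loss_safe(y_true, y_pred):
    '''Euclidean distance with a small epsilon so the sqrt gradient stays finite at 0.'''
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1) + K.epsilon())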
With this model, I then load my data and try to train on it.
x_train, y_train, x_val, y_val = load_data(DATASET_DIR)
# This load_data is my own implementation. It returns the images
# as tensors.
# ==> x_train[0].shape= (224, 224, 3)
#
# y_train and y_val look like this:
# ==> y_train[0]= [ 0.182 -0.0933 0.072 -0.0453]
#
# holding values in the [0, 1] interval for where the pixel
# is relative to the width/height of the image.
#
model_final.fit(x_train, y_train,
                batch_size=batch_size, epochs=5, shuffle=False,
                validation_data=(x_val, y_val))
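(As a quick sanity check before fit, assuming x_train and y_train are numpy arrays as in the snippets above, something like this can rule out NaN/Inf values that are already present in the data:)

import numpy as np

print(x_train.shape, x_train.dtype, x_train.min(), x_train.max())
print(y_train.shape, y_train.dtype, y_train.min(), y_train.max())
assert not np.isnan(x_train).any() and not np.isinf(x_train).any()
assert not np.isnan(y_train).any() and not np.isinf(y_train).any()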
Unfortunately, when I run this model for training, this is what I get:
Train on 45 samples, validate on 5 samples
Epoch 1/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 2/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 3/5
I have no idea why my loss values are 'nan'. I must be doing something wrong, and I've tried changing everything (the loss function, the shape of the output...) but I can't figure out what it is.
Any help would be appreciated!
UPDATE: it looks like the problem is in my load_data.
If I create the image data like this, it fails and the loss is nan:
i = pil_image.open(img_filename)
img = image.load_img(img_filename, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = keras.applications.mobilenet.preprocess_input(x)
x_train = np.append(x_train, x, axis=0)
But if I do something trivial like this instead, 'fit' works just fine and computes real values for the loss:
x_train = np.random.random((100, 224, 224, 3))
Sigh... I have no idea what's going on...
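(One way to narrow down a difference like this is to compare the real array with the random one that trains fine; a small sketch, reusing the names from the snippets above:)

import numpy as np

real = x_train                                      # built via load_data / np.append
fake = np.random.random((len(real), 224, 224, 3))   # the version that trains fine
for name, arr in [("real", real), ("fake", fake)]:
    print(name, arr.shape, arr.dtype, arr.min(), arr.max(),
          "has NaN:", np.isnan(arr).any())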
UPDATE #2: I figured out what the problem was.
Documenting it here in case it helps anyone.
The correct way to generate the input tensors for MobileNet is this:
test_img = []
for i in range(len(test)):
    temp_img = image.load_img(test_path + test['filename'][i], target_size=(224, 224))
    temp_img = image.img_to_array(temp_img)
    test_img.append(temp_img)
test_img = np.array(test_img)
test_img = preprocess_input(test_img)
Note how everything gets turned into a single numpy.array and preprocess_input is run on the entire batch of images at once. Doing it image by image (which is what I was doing before) did not seem to work.
Hope this helps someone someday.
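(For completeness, a minimal load_data-style sketch that follows this pattern; the helper name and filenames argument are just for illustration:)

import numpy as np
from keras.preprocessing import image
from keras.applications.mobilenet import preprocess_input

def load_images(filenames):
    '''Load every image into one float array, then preprocess the whole batch at once.'''
    imgs = []
    for fname in filenames:
        img = image.load_img(fname, target_size=(224, 224))
        imgs.append(image.img_to_array(img))
    imgs = np.array(imgs)            # shape: (N, 224, 224, 3)
    return preprocess_input(imgs)    # MobileNet preprocessing on the full batch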