I'm trying to use Keras and its MobileNet implementation to do object localization (outputting the x/y coordinates of a few features, rather than classes), and I've run into what is probably a very basic problem that I just can't figure out.
My code looks like this:
from keras import applications, optimizers
from keras import backend as K
from keras.layers import Conv2D, Dense, Dropout, GlobalAveragePooling2D, Reshape
from keras.models import Model

PREDICT_SIZE = 4  # two pairs of x/y coordinates

# =============================
# Load MobileNet and change the top layers.
model = applications.MobileNet(weights="imagenet",
                               include_top=False,
                               input_shape=(224, 224, 3))

# Freeze all the layers except the very last 5.
for layer in model.layers[:-5]:
    layer.trainable = False

# Adding custom Layers at the end, after the last Conv2D layer.
x = model.output
x = GlobalAveragePooling2D()(x)
x = Reshape((1, 1, 1024))(x)
x = Dropout(0.5)(x)
x = Conv2D(1024, (1, 1), activation='relu', padding='same', name='conv_preds')(x)
x = Dense(1024, activation="relu")(x)
# I'd like this to output 4 variables, two pairs of x/y coordinates.
x = Dense(PREDICT_SIZE, activation="sigmoid")(x)
predictions = Reshape((PREDICT_SIZE,))(x)

# =============================
# Create the new final model.
model_final = Model(inputs=model.input, outputs=predictions)

def custom_loss(y_true, y_pred):
    '''Trying to compute the Euclidean distance as a loss function.'''
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1))

model_final.compile(loss=custom_loss,
                    optimizer=optimizers.Adam(lr=0.0001),
                    metrics=["accuracy"])
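(As an aside on the loss itself: K.sqrt has an infinite gradient at zero, so Euclidean-distance losses are sometimes written with a small epsilon under the square root. A minimal defensive sketch of that variant, not necessarily related to the nan below:)

def custom_loss_safe(y_true, y_pred):
    '''Euclidean distance with a small epsilon so the sqrt gradient stays finite at 0.'''
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1) + K.epsilon())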
With this model, I then load my data and try to train on it.
x_train, y_train, x_val, y_val = load_data(DATASET_DIR)
# This load_data is my own implementation. It returns the images
# as tensors.
# ==> x_train[0].shape= (224, 224, 3)
#
# y_train and y_val look like this:
# ==> y_train[0]= [ 0.182 -0.0933 0.072 -0.0453]
#
# holding values in the [0, 1] interval for where the pixel
# is relative to the width/height of the image.
#
model_final.fit(x_train, y_train,
                batch_size=batch_size, epochs=5, shuffle=False,
                validation_data=(x_val, y_val))
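(As a quick sanity check before fit, assuming x_train and y_train are numpy arrays as in the snippets above, something like this can rule out NaN/Inf values that are already present in the data:)

import numpy as np

print(x_train.shape, x_train.dtype, x_train.min(), x_train.max())
print(y_train.shape, y_train.dtype, y_train.min(), y_train.max())
assert not np.isnan(x_train).any() and not np.isinf(x_train).any()
assert not np.isnan(y_train).any() and not np.isinf(y_train).any()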
Unfortunately, when I run this model for training, this is what I get:
Train on 45 samples, validate on 5 samples
Epoch 1/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 2/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 3/5
I have no idea why my loss values are 'nan'. I must be doing something wrong, and I've tried changing everything (the loss function, the shape of the output...) but I can't figure out what it is.
Any help would be appreciated!
UPDATE: it looks like the problem is in my load_data.
If I create the image data like this, it fails and the loss is nan:
i = pil_image.open(img_filename)
img = image.load_img(img_filename, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = keras.applications.mobilenet.preprocess_input(x)
x_train = np.append(x_train, x, axis=0)
But if I do something trivial like this instead, 'fit' works just fine and computes real values for the loss:
x_train = np.random.random((100, 224, 224, 3))
Sigh... I have no idea what's going on...
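(One way to narrow down a difference like this is to compare the real array with the random one that trains fine; a small sketch, reusing the names from the snippets above:)

import numpy as np

real = x_train                                      # built via load_data / np.append
fake = np.random.random((len(real), 224, 224, 3))   # the version that trains fine
for name, arr in [("real", real), ("fake", fake)]:
    print(name, arr.shape, arr.dtype, arr.min(), arr.max(),
          "has NaN:", np.isnan(arr).any())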
UPDATE #2: I figured out what the problem was.
Documenting it here in case it helps anyone.
The correct way to generate the input tensors for MobileNet is this:
test_img = []
for i in range(len(test)):
    temp_img = image.load_img(test_path + test['filename'][i], target_size=(224, 224))
    temp_img = image.img_to_array(temp_img)
    test_img.append(temp_img)
test_img = np.array(test_img)
test_img = preprocess_input(test_img)
Note how everything gets turned into a single numpy.array and preprocess_input is run on the entire batch of images at once. Doing it image by image (which is what I was doing before) did not seem to work.
Hope this helps someone someday.
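(For completeness, a minimal load_data-style sketch that follows this pattern; the helper name and filenames argument are just for illustration:)

import numpy as np
from keras.preprocessing import image
from keras.applications.mobilenet import preprocess_input

def load_images(filenames):
    '''Load every image into one float array, then preprocess the whole batch at once.'''
    imgs = []
    for fname in filenames:
        img = image.load_img(fname, target_size=(224, 224))
        imgs.append(image.img_to_array(img))
    imgs = np.array(imgs)            # shape: (N, 224, 224, 3)
    return preprocess_input(imgs)    # MobileNet preprocessing on the full batch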