What loss function should I use with Keras?

Asked: 2018-07-28 21:12:46

Tags: python tensorflow machine-learning keras

import numpy as np
from random import randint
from sklearn.preprocessing import MinMaxScaler
import keras
from keras import backend as K
from keras.models import Sequential
from keras.layers import Activation
from keras.layers.core import Dense
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy

train_labels = [68, 65, 67, 71, 69, 72, 75, 70, 85, 83, 88, 80, 80, 78, 79, 85, 88, 86, 92, 91, 91, 93, 93, 90, 96, 97, 100, 100]

train_samples = [[2, 1, 73],[4, 0.5, 65],[3, 1, 70],[6, 1, 75],[7, 0.5, 68],[9, 1, 72],[3, 5, 70],[2, 6, 65],[4, 5, 78],[8, 3, 75],[9, 2, 80],[9, 4, 69],[2, 2, 88],[3, 1, 85],[7, 1, 83],[9, 1, 87],[3, 5, 88],[2, 7, 84],[7, 3, 88],[9, 4, 85],[4, 1, 93],[3, 1, 95],[8, 1, 93], [9, 0.5, 92], [3, 5, 94], [2, 7, 96], [8, 4, 97], [7, 5, 94]]

train_labels = np.array(train_labels)
train_samples = np.array(train_samples)

model = Sequential([
  Dense(8, input_shape=(3,)),
  Dense(16),
  Dense(1)
])

print(model.summary())

model.compile(Adam(lr=0.0001), loss="????", metrics = ["accuracy"])

model.fit(train_samples, train_labels, validation_split = 0.1, batch_size=1, epochs=100, verbose = 2)

I am trying to train a NN to predict a test grade based on hours of sleep, hours of study, and the current average in the class. I am very new to NNs, so I don't know what loss function to use. I followed a few NN tutorials where accuracy was always around 95%, but here, no matter what loss function I use, the accuracy is 0. Does anyone know whether this is because I didn't scale my training set, or does anyone know what loss function I should use? Thanks.

2 answers:

Answer 0 (score: 3):

The target you are trying to predict lives in a continuous space (this is regression, not classification). The loss should be "mse" / "mean_squared_error", and the metric should be metrics=["mse"] rather than accuracy.

For neural networks in general, it is recommended to normalize the inputs to roughly mean = 0 and std = 1. You can easily achieve this with scikit-learn's sklearn.preprocessing.StandardScaler().

The refactored code for your project would look like this (tested with tensorflow-cpu==1.9 and keras==2.2.2):

import numpy as np
from sklearn.preprocessing import StandardScaler
import keras
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
print(f"Tensorflow version: {tf.__version__}, Keras version: {keras.__version__}")

# data
train_labels = [68, 65, 67, 71, 69, 72, 75, 70, 85, 83, 88, 80, 80, 78, 79, 85, 88, 86, 92, 91, 91, 93, 93, 90, 96, 97, 100, 100]
train_samples = [[2, 1, 73],[4, 0.5, 65],[3, 1, 70],[6, 1, 75],[7, 0.5, 68],[9, 1, 72],[3, 5, 70],[2, 6, 65],[4, 5, 78],[8, 3, 75],[9, 2, 80],[9, 4, 69],[2, 2, 88],[3, 1, 85],[7, 1, 83],[9, 1, 87],[3, 5, 88],[2, 7, 84],[7, 3, 88],[9, 4, 85],[4, 1, 93],[3, 1, 95],[8, 1, 93], [9, 0.5, 92], [3, 5, 94], [2, 7, 96], [8, 4, 97], [7, 5, 94]]
train_labels = np.array(train_labels)
train_samples = np.array(train_samples)
# preprocessing (min-max or standard scaler is fine)
sc = StandardScaler()
train_samples_scaled = sc.fit_transform(train_samples)
print(f"Feature means before scaling: {train_samples.mean(axis=0)}, Feature means after scaling: {train_samples_scaled.mean(axis=0)}")

# neural network
model = Sequential([
  Dense(8, input_shape=(3,), activation='relu'),
  Dense(16, activation='relu'),
  Dense(1, activation='linear')
])
model.compile(Adam(lr=0.001), loss="mse", metrics = ["mse"])
# training
model.fit(train_samples_scaled, train_labels, validation_split = 0.1, batch_size=5, epochs=20, verbose = 2)
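To get a prediction for a new student afterwards, the new sample must go through the same fitted scaler before calling model.predict. A minimal sketch (the sleep/study/average values here are made up for illustration):

# hypothetical new student: 6 hours of sleep, 3 hours of study, current average 82
new_sample = np.array([[6, 3, 82]])
new_sample_scaled = sc.transform(new_sample)  # reuse the scaler fitted on the training data
predicted_grade = model.predict(new_sample_scaled)
print(predicted_grade)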

Answer 1 (score: 1):

You are trying to predict a real number (a test mark), so you are dealing with a regression problem. You will want to use 'mean_squared_error' as your loss function and track mse as your metric instead of accuracy.
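As a minimal sketch, only the compile call in your code needs to change:

model.compile(Adam(lr=0.0001), loss="mean_squared_error", metrics=["mse"])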

Strictly speaking, scaling your data (e.g. to between 0 and 1) is not required, but it can help your network converge faster.
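For example, a rough sketch of 0-to-1 scaling using the MinMaxScaler you already import (variable names follow the question's code):

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_train_samples = scaler.fit_transform(train_samples)  # each feature now lies in [0, 1]
model.fit(scaled_train_samples, train_labels, validation_split=0.1, batch_size=1, epochs=100, verbose=2)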