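I am trying to perform hyperparameter optimization on a neural network, but when I try a large number of hidden layers, the network always predicts the same output, so my list of (negative) losses looks like this: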
-0.789302627913455
-0.789302627913455
-0.789302627913455
-1
-0.789302627913455
-0.789302627913455
-1
-0.789302627913455
-0.789302627913455
-0.789302627913455
Here is my neural network:
# Imports were not shown in the original post; these are assumed from the
# names used below (standalone Keras on a TensorFlow 1.x backend).
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.constraints import maxnorm
from keras.layers import Dense, Dropout
from keras.models import Sequential
from keras.optimizers import Adam


def nn(learningRate, layers, neurons, dropoutIn, dropoutHidden, miniBatch, activationFun, epoch):
    # Single-threaded TF1-style session, registered with the Keras backend
    session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
    sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
    K.set_session(sess)

    # Load the inputs and rescale them to the range [-1, 1]
    x_data = np.loadtxt('../4nodes_demand_vector')
    x_data = x_data - x_data.min()
    x_data = x_data / x_data.max() * 2
    x_data = x_data - 1
    y_data = np.loadtxt('../4nodes_vlink_vector')

    input_dim = x_data.shape[1]
    output_dim = y_data.shape[1]
    split_ratio = 0.75
    number_of_samples = x_data.shape[0]
    split_index = int(number_of_samples * split_ratio)

    # train data
    x_data_train = x_data[:split_index, ]
    y_data_train = y_data[:split_index, ]
    # test data
    x_data_test = x_data[split_index:, ]
    y_data_test = y_data[split_index:, ]

    adam = Adam(lr=learningRate)
    model = Sequential()
    model.add(Dropout(dropoutIn, input_shape=(input_dim,)))
    # First hidden layer; no activation is passed, so it is linear
    model.add(Dense(units=neurons, input_shape=(input_dim,), kernel_constraint=maxnorm(3)))
    # Remaining hidden layers, each preceded by dropout
    for i in range(layers - 1):
        model.add(Dropout(dropoutHidden))
        model.add(Dense(units=neurons, activation=activationFun, kernel_constraint=maxnorm(3)))
    model.add(Dense(units=output_dim, activation='sigmoid'))
    model.compile(loss='mean_squared_error', optimizer=adam)
    model.fit(x_data_train, y_data_train, batch_size=miniBatch, validation_split=0.1, epochs=epoch, verbose=2)

    # A test sample counts as correct only if all rounded outputs match
    predict = model.predict(x_data_test)
    round_predict = np.round(predict)
    correct = np.sum(np.all(round_predict == y_data_test, axis=1))
    number_of_test_data = x_data_test.shape[0]
    # Negative exact-match accuracy shifted by -1: -1 means nothing matched
    loss = -1.0 + (correct / float(number_of_test_data))
    print("Loss: ", loss)
    return loss
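The neural network is (unfortunately) trained on private data. It has 12 input neurons and 12 output neurons, and I have 43,000 data samples. The idea of setting kernel_constraint to maxnorm(3) comes from http://jmlr.org/papers/v15/srivastava14a.html, since I ran into several NaN problems.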
Answer 0 (score: 0)
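I know this question is 2 years old, but...
My guess is that the problem comes from combining a 'sigmoid' final activation (typically used for classification) with 'mean_squared_error' as the loss function (a regression loss).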
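One commonly cited reason this pairing can stall training, offered here only as an illustration and not as a confirmed diagnosis of the network in the question: with a sigmoid output, the gradient of the squared error with respect to the pre-activation carries an extra factor p(1 - p), which shrinks toward zero once the output saturates, while the binary cross-entropy gradient reduces to p - y. The minimal sketch below compares the two for a single output unit with target 1. The function names are hypothetical and are not part of Keras.

```python
import math


def sigmoid(z):
    """Logistic function mapping a pre-activation z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def mse_grad_wrt_logit(z, y):
    """Derivative of (p - y)**2 with respect to z, where p = sigmoid(z)."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)


def bce_grad_wrt_logit(z, y):
    """Derivative of -[y*log(p) + (1-y)*log(1-p)] with respect to z."""
    return sigmoid(z) - y


if __name__ == "__main__":
    # Target is 1; increasingly negative z means a more confidently wrong output.
    for z in (0.0, -4.0, -8.0):
        print(z, mse_grad_wrt_logit(z, 1.0), bce_grad_wrt_logit(z, 1.0))
```

In this sketch the squared-error gradient becomes very small as the output saturates on the wrong side, whereas the cross-entropy gradient stays close to -1, so a saturated sigmoid output can still be corrected.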
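Based on your final loss calculation, it looks like you are trying to do binary classification. So maybe try changing the loss function to binary cross-entropy.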
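To see why the reported values repeat exactly, here is a small pure-Python sketch of the exact-match score from the question (`-1 + correct / n`), run on made-up toy data. The helper name `exact_match_loss` and the toy targets are hypothetical. Any two networks that collapse to outputs rounding to the same constant vector get the same score, which is consistent with the repeated -0.789... values, although it does not by itself prove that this is what happened.

```python
def exact_match_loss(predictions, targets):
    """Return -1 + (fraction of samples whose rounded outputs all match)."""
    correct = sum(
        1
        for pred, target in zip(predictions, targets)
        if [round(p) for p in pred] == list(target)
    )
    return -1.0 + correct / float(len(targets))


# Toy binary targets (made up for illustration, not the private data).
toy_targets = [[1, 0], [1, 0], [0, 1], [1, 1]]

# Two different "collapsed" networks that output a constant vector.
constant_a = [[0.9, 0.1]] * len(toy_targets)
constant_b = [[0.6, 0.4]] * len(toy_targets)

if __name__ == "__main__":
    print(exact_match_loss(constant_a, toy_targets))
    print(exact_match_loss(constant_b, toy_targets))
```

In the question's code, the suggested change would amount to passing `loss='binary_crossentropy'` instead of `loss='mean_squared_error'` to `model.compile`, assuming every entry of the target vectors is 0 or 1.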