I have a multilayer perceptron for a multi-output regression problem that predicts 14 continuous values. Below is the code snippet for it:
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

# Parameters
learning_rate = 0.001
training_epochs = 1000
batch_size = 500
# Network Parameters
n_hidden_1 = 32
n_hidden_2 = 200
n_hidden_3 = 200
n_hidden_4 = 256
n_input = 14
n_classes = 14
# tf Graph input
x = tf.placeholder("float", [None, n_input],name="x")
y = tf.placeholder("float", [None, n_classes])
# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1], 0, 0.1)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], 0, 0.1)),
    'h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3], 0, 0.1)),
    'h4': tf.Variable(tf.random_normal([n_hidden_3, n_hidden_4], 0, 0.1)),
    'out': tf.Variable(tf.random_normal([n_hidden_4, n_classes], 0, 0.1))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1], 0, 0.1)),
    'b2': tf.Variable(tf.random_normal([n_hidden_2], 0, 0.1)),
    'b3': tf.Variable(tf.random_normal([n_hidden_3], 0, 0.1)),
    'b4': tf.Variable(tf.random_normal([n_hidden_4], 0, 0.1)),
    'out': tf.Variable(tf.random_normal([n_classes], 0, 0.1))
}
# Create model
def multilayer_perceptron(x):
    # Four hidden layers, each with ReLU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    layer_3 = tf.add(tf.matmul(layer_2, weights['h3']), biases['b3'])
    layer_3 = tf.nn.relu(layer_3)
    layer_4 = tf.add(tf.matmul(layer_3, weights['h4']), biases['b4'])
    layer_4 = tf.nn.relu(layer_4)
    # Linear output layer (no activation) for regression
    out_layer = tf.matmul(layer_4, weights['out']) + biases['out']
    return out_layer
# Construct model
pred = multilayer_perceptron(x)
cost = tf.reduce_mean(tf.square(pred-y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Run the graph in the session
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(total_len/batch_size)
        for i in range(total_batch-1):
            batch_x = X_train[i*batch_size:(i+1)*batch_size]
            batch_y = Y_train[i*batch_size:(i+1)*batch_size]
            _, c, p = sess.run([optimizer, cost, pred], feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch
Output:
x_batch_data:
[ 1.77560000e+04 4.00000000e+00 4.00000000e+00 ..., 1.00000000e+00
5.61000000e+02 1.00000000e+00]
[ 1.34310000e+04 4.00000000e+00 4.00000000e+00 ..., 1.00000000e+00
5.61000000e+02 1.00000000e+00]
[ 2.98800000e+03 1.00000000e+00 0.00000000e+00 ..., 0.00000000e+00
0.00000000e+00 1.00000000e+00]
y_batch_data:
[[ 4.19700000e-01 1.04298450e+02 1.50000000e+02 ..., 2.75250000e-01
1.02000000e-01 7.28565000e+00]
[ 5.59600000e-01 1.39064600e+02 2.00000000e+02 ..., 3.67000000e-01
1.36000000e-01 9.71420000e+00]
[ 2.79800000e-01 6.95323000e+01 1.00000000e+02 ..., 1.83500000e-01
6.80000000e-02 4.85710000e+00]
Prediction:
[[ 0.85085869 90.53585815 130.17015076 ..., 0.62335277
0.26637274 5.52062225]
[ 0.85085869 90.53585815 130.17015076 ..., 0.62335277
0.26637274 5.52062225]
[ 0.85085869 90.53585815 130.17015076 ..., 0.62335277
0.26637274 5.52062225]
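The predicted values are always the same even though the input values differ. Can someone point out what might be the reason behind this?
P.S. A similar question I referred to: tensorflow deep neural network for regression always predict same results in one batch
What I have tried:
1. Gradually reduced the learning rate from 0.1 to 0.0001
2. Tried other optimization algorithms
3. Changed the network architecture (number of hidden nodes and layers, and the activation function)
Any help is appreciated.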
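The x_batch_data printed above mixes features on very different scales (values around 1.8e4 next to values of 1.0). Whether that is the cause here is not confirmed anywhere in this thread, but unscaled inputs are a commonly suggested thing to rule out when a ReLU network predicts one constant row. The following is a minimal sketch of per-feature standardization with NumPy. The helper name `standardize` and the three-column demo array are hypothetical and only loosely based on the printed batch.

```python
import numpy as np

def standardize(X, mean=None, std=None, eps=1e-8):
    """Scale each feature column to roughly zero mean and unit variance.

    Pass the training-set `mean` and `std` back in when scaling test data,
    so both sets go through the same transformation.
    """
    X = np.asarray(X, dtype=float)
    if mean is None:
        mean = X.mean(axis=0)
    if std is None:
        std = X.std(axis=0)
    # eps guards against division by zero for constant columns
    return (X - mean) / (std + eps), mean, std

# Hypothetical three-column excerpt, for illustration only
X_demo = np.array([[17756.0, 4.0, 561.0],
                   [13431.0, 4.0, 561.0],
                   [2988.0, 1.0, 0.0]])

X_scaled, mu, sigma = standardize(X_demo)
print(X_scaled.mean(axis=0))  # each column mean is close to 0
print(X_scaled.std(axis=0))   # each column std is close to 1
```

With the question's variables, this would be applied to `X_train` before slicing out `batch_x`. The targets in y_batch_data also span several orders of magnitude, so the same treatment of `Y_train` may be worth trying, again as an assumption to test rather than a known fix.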
Answer 0 (score: 1)
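The problem seems to be:
If the weights stay the same and the batch input stays the same, the predictions will stay the same. Hope this helps you solve it.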
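Both conditions in this answer can be checked directly. The sketch below, assuming NumPy arrays pulled out of the session, tests whether all rows of a prediction batch are the same vector and how much the parameters moved between two snapshots. The helper names `rows_collapsed` and `max_weight_change` are hypothetical, and the demo array simply repeats part of the row printed in the question.

```python
import numpy as np

def rows_collapsed(pred, tol=1e-6):
    """Return True if every row of `pred` is (nearly) the same vector."""
    pred = np.asarray(pred, dtype=float)
    # Range of each output column across the batch; ~0 means identical rows
    spread = pred.max(axis=0) - pred.min(axis=0)
    return bool(np.all(spread <= tol))

def max_weight_change(before, after):
    """Largest absolute parameter change between two snapshots (dicts of arrays)."""
    return max(
        float(np.max(np.abs(np.asarray(after[k]) - np.asarray(before[k]))))
        for k in before
    )

# Demo: two identical rows, taken from the prediction printed in the question
pred_demo = np.array([[0.85085869, 90.53585815, 130.17015076],
                      [0.85085869, 90.53585815, 130.17015076]])
print(rows_collapsed(pred_demo))  # True

# Demo: hypothetical weight snapshots taken before and after one training step
w_before = {'h1': np.zeros((2, 2))}
w_after = {'h1': np.full((2, 2), 0.5)}
print(max_weight_change(w_before, w_after))  # 0.5
```

In the question's code, the snapshots could plausibly be obtained with `sess.run(weights)` before and after a training step, since `Session.run` in TensorFlow 1.x accepts a dict of tensors as fetches. If the weights do change between steps but `rows_collapsed(p)` is still true for different inputs, the cause lies somewhere other than the scenario this answer describes.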