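I have been trying to solve a prediction/regression problem with TensorFlow, but I have run into some trouble. Before I explain the actual issue, let me give some background.
The data I have been working with is a set of 5 features, call them [f1, f2, f3, f4, f5], that in some way describe a particular phenomenon identified by a real value (the target).
What I have been trying to do is train a multilayer perceptron to learn the relationship between the features and the target value. In short, I want to predict real values based on what the neural network has seen during the training phase.
I have framed this as a prediction/regression problem and written the following code: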
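import os
import tensorflow as tf  # pre-2.0 TensorFlow API (placeholders, sessions)
# `loader` is a custom data-loading module that is not shown here
# Pick the device to run on
os.environ['CUDA_VISIBLE_DEVICES'] = '1'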
# Parameters
learning_rate = 0.001
training_epochs = 99999
batch_size = 4096
STDDEV = 0.1
# Network Parameters
n_hidden_1 = 10 # 1st layer number of neurons
n_hidden_2 = 10 # 2nd layer number of neurons
n_hidden_3 = 10 # 3rd layer number of neurons
n_hidden_4 = 10 # 4th layer number of neurons
n_hidden_5 = 10 # 5th layer number of neurons
n_input = 5 # number of features
n_classes = 1 # one target value (float)
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
# LOADING DATA
data_train = loader.loadDataset(dir_feat, train_path, 'TRAIN', features)
data_test = loader.loadDataset(dir_feat, test_path, 'TEST', features)
valid_period = 5
test_period = 10
def multilayer_perceptron(x, weights, biases):
    # Five hidden layers, each with sigmoid activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.sigmoid(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.sigmoid(layer_2)
    layer_3 = tf.add(tf.matmul(layer_2, weights['h3']), biases['b3'])
    layer_3 = tf.nn.sigmoid(layer_3)
    layer_4 = tf.add(tf.matmul(layer_3, weights['h4']), biases['b4'])
    layer_4 = tf.nn.sigmoid(layer_4)
    layer_5 = tf.add(tf.matmul(layer_4, weights['h5']), biases['b5'])
    layer_5 = tf.nn.sigmoid(layer_5)
    # Output layer with linear activation
    out = tf.matmul(layer_5, weights['out']) + biases['out']
    return out
# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1], stddev=STDDEV)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], stddev=STDDEV)),
    'h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3], stddev=STDDEV)),
    'h4': tf.Variable(tf.random_normal([n_hidden_3, n_hidden_4], stddev=STDDEV)),
    'h5': tf.Variable(tf.random_normal([n_hidden_4, n_hidden_5], stddev=STDDEV)),
    'out': tf.Variable(tf.random_normal([n_hidden_5, n_classes], stddev=STDDEV))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'b3': tf.Variable(tf.random_normal([n_hidden_3])),
    'b4': tf.Variable(tf.random_normal([n_hidden_4])),
    'b5': tf.Variable(tf.random_normal([n_hidden_5])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
# Construct model
pred = multilayer_perceptron(x, weights, biases)
def RMSE():
    # Root mean squared error between targets and predictions
    return tf.sqrt(tf.reduce_mean(tf.square(y - pred)))
cost = RMSE()
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
# Initializing the variables
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(1, training_epochs):
        avg_cost = 0.
        avg_R_square_train = []
        train_dataset = loader.Dataset(data=data_train, batch_size=batch_size, num_feats=n_input)
        total_batch = train_dataset.getNumberBatches()
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = train_dataset.next_batch(update=True)
            # Run optimization op (backprop) and cost op (to get loss value)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            c_train = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            # Compute average loss
            avg_cost += c_train / total_batch
        print("Epoch:" + str(epoch) + ", TRAIN_loss = {:.9f}".format(avg_cost))
        # TESTING
        if epoch % test_period == 0:
            c_test = sess.run(cost, feed_dict={x: data_test[0][0], y: data_test[0][1]})
            print("Epoch:" + str(epoch) + ", TEST_loss = {:.9f}".format(c_test))
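A note on reading the numbers: TRAIN_loss in the loop above is the mean of the per-batch RMSE values, while TEST_loss is a single RMSE over the whole test array, so the two are not computed in quite the same way. The following is a minimal pure-Python sketch of the difference. The toy batches are made up for illustration and are not taken from my data.

```python
import math

def rmse(targets, preds):
    """Root mean squared error over paired lists of floats."""
    squared = [(t - p) ** 2 for t, p in zip(targets, preds)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy data: two batches of (targets, predictions).
batches = [
    ([1.0, 2.0], [1.0, 2.0]),  # perfect predictions, batch RMSE = 0
    ([3.0, 4.0], [1.0, 2.0]),  # every prediction off by 2, batch RMSE = 2
]

# How TRAIN_loss is accumulated: average of the per-batch RMSE values.
mean_of_batch_rmse = sum(rmse(t, p) for t, p in batches) / len(batches)

# How TEST_loss is computed: one RMSE over all samples at once.
all_targets = [t for ts, _ in batches for t in ts]
all_preds = [p for _, ps in batches for p in ps]
overall_rmse = rmse(all_targets, all_preds)

print(mean_of_batch_rmse)  # 1.0
print(overall_rmse)        # sqrt(2), about 1.414
```

The mean of per-batch RMSE values can never exceed the RMSE over all samples when the batches are the same size, so a small gap between the two printed losses may come partly from how they are aggregated.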
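The problem I am facing is that the cost function on the test set gets stuck (after some iterations) in a local minimum and stops decreasing: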
Epoch:6697, TRAIN_loss = 2.162182076
Epoch:6698, TRAIN_loss = 2.156500859
Epoch:6699, TRAIN_loss = 2.157814605
Epoch:6700, TRAIN_loss = 2.160744122
Epoch:6700, TEST_loss = 2.301288128
Epoch:6701, TRAIN_loss = 2.139338647
...
Epoch:6709, TRAIN_loss = 2.166410744
Epoch:6710, TRAIN_loss = 2.162357884
Epoch:6710, TEST_loss = 2.301478863
Epoch:6711, TRAIN_loss = 2.143475396
...
Epoch:6719, TRAIN_loss = 2.145476401
Epoch:6720, TRAIN_loss = 2.150237552
Epoch:6720, TEST_loss = 2.301517725
Epoch:6721, TRAIN_loss = 2.151232243
...
Epoch:6729, TRAIN_loss = 2.163080522
Epoch:6730, TRAIN_loss = 2.160523321
Epoch:6730, TEST_loss = 2.301782370
...
Epoch:6739, TRAIN_loss = 2.156920952
Epoch:6740, TRAIN_loss = 2.162290675
Epoch:6740, TEST_loss = 2.301943779
...
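I have tried changing several hyperparameters, such as the number of hidden layers and/or nodes, the learning rate, the batch size, etc., but the situation does not change at all. I have also tried other loss functions, such as MAE and MSE.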
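For reference, this is roughly what those two alternative losses compute, written as a plain-Python sketch rather than the TensorFlow ops I actually used. The function names and the toy values are illustrative assumptions only.

```python
def mse(targets, preds):
    """Mean squared error; RMSE is the square root of this value."""
    return sum((t - p) ** 2 for t, p in zip(targets, preds)) / len(targets)

def mae(targets, preds):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(targets, preds)) / len(targets)

# Hypothetical toy values: three exact predictions and one that is off by 4.
targets = [1.0, 2.0, 3.0, 4.0]
preds = [1.0, 2.0, 3.0, 8.0]

print(mse(targets, preds))  # 4.0
print(mae(targets, preds))  # 1.0
```

The single large error dominates MSE (and therefore RMSE) much more than MAE, which is the main practical difference between them.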
The number of data samples I have is approximately 270,000.
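To put that figure next to the batch size in the script, here is a quick back-of-the-envelope calculation. It assumes, purely for illustration, that all of the roughly 270,000 samples are used for training and that the loader keeps the final partial batch. Neither assumption is confirmed by the code shown.

```python
import math

num_samples = 270_000  # approximate figure; assumed to be all training data
batch_size = 4096      # value used in the script

# Number of gradient steps per epoch if the last partial batch is kept.
batches_per_epoch = math.ceil(num_samples / batch_size)

# Size of that final, smaller batch.
full_batches = num_samples // batch_size
last_batch_size = num_samples - full_batches * batch_size

print(batches_per_epoch)  # 66
print(last_batch_size)    # 3760
```

So under these assumptions each epoch corresponds to about 66 weight updates.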
Can anyone suggest how to solve this problem, or give me some useful advice?
Thanks in advance.
Davide