MLPRegressor predictions have an upper and lower bound

Asked: 2018-09-03 23:57:43

Tags: python scikit-learn neural-network regression non-linear-regression

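I'm fairly new to machine learning, but I'm trying to use MLPRegressor from scikit-learn to model data with 4 inputs and 1 output. The dataset has more inputs and outputs, but I believe I've selected the only inputs that are relevant to the output I chose. There are about 60,000 samples in my dataset. The model learns most of the data, but its output seems to have an upper and a lower bound.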

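I've tried many different combinations of hyperparameters, but none of them got rid of the apparent bounds on the output. I also tried standardizing the data, but it didn't really help. For this particular set of hyperparameters, the loss is 106.555, and the score on both the training and testing data is 0.998.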
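For readers who want to reproduce the standardization step, one possible setup is sketched below. It scales both the inputs and the target before they reach the network and converts predictions back to the original units. The synthetic data and the reduced `max_iter` are assumptions for illustration only. This is not the original dataset, nor necessarily the preprocessing that was actually tried in the question.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 4 inputs, 1 output within about -700..700
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(1000, 4))
y = 600.0 * X[:, 0] * X[:, 1] + 100.0 * X[:, 2]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=42)

# Standardize the inputs inside a pipeline so the scaler is fit on training data only
regressor = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(10, 10, 10), activation='logistic',
                 solver='adam', learning_rate_init=0.005,
                 max_iter=500, random_state=1),
)

# Standardize the target as well; predict() returns values in the original units
model = TransformedTargetRegressor(regressor=regressor,
                                   transformer=StandardScaler())
model.fit(X_train, y_train)

print('Score on testing data: {:.3f}'.format(model.score(X_test, y_test)))
```

Wrapping the scalers around the model this way keeps the test split out of the scaling statistics, which makes the training and testing scores easier to compare.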

Here is the code:

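import csv
import winsound  # Windows-only; used below to play a sound when training finishes

from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Importing the data from a .csv file (data_path must be set to the file's path beforehand)
# Note: input_cols/output_cols are not used below; the loop reads columns 3, 4, 7, 10 and 9
input_cols = [3,4,7,9]
output_cols = [8]
X, y, all_data = [], [], []
with open(data_path, 'r') as data:
    reader = csv.reader(data)
    for line in reader:
        try:
            a = [float(line[3]), float(line[4]), float(line[7]), float(line[10])]
            b = [float(line[9])]
            X.append(a)
            y.append(b)
        except ValueError:
            # Rows with non-numeric fields (e.g. a header row) are skipped
            print('ValueError')
    all_data = [X, y]
print('Done importing data')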

# Splitting the data for training and testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state = 42)

# Training the neural network
model = MLPRegressor(max_iter=10**4, verbose=True, hidden_layer_sizes=(10,10,10), tol=0.00001,
                     learning_rate_init=0.005, random_state=1, activation='logistic', solver='adam')
print('Beginning training')
model.fit(X_train, y_train)
print('Training complete')

# Results
winsound.PlaySound('C:/Windows/media/Windows Background.wav', winsound.SND_FILENAME)
print('Score on training data: {:.3f}'.format(model.score(X_train, y_train)))
print('Score on testing data: {:.3f}'.format(model.score(X_test, y_test)))
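As background for the scores printed above: for scikit-learn regressors, `score` returns the coefficient of determination R². The sketch below computes R² by hand so it is clear what a value such as 0.998 measures. The helper name `r_squared` is hypothetical and introduced only for illustration.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot
```

Because R² averages the squared error over all samples, a value close to 1 can coexist with visible errors on a small number of samples, for example those near the extremes of the output range.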

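Below are plots of the results. Each plot shows one of the 4 inputs on the x-axis and the output on the y-axis. Red is the network's prediction and blue is the actual data. For privacy reasons I had to remove the axes, but note that the y-axis ranges from -700 to 700.

[Figure: Results plots]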
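Since the axes are removed from the plots, the apparent bounds could also be checked numerically. The sketch below compares the range of the predictions with the range of the actual values and reports what fraction of the actual values fall outside the span the predictions cover. The helper names `range_report` and `fraction_outside` are hypothetical and not part of scikit-learn.

```python
import numpy as np

def range_report(y_true, y_pred):
    """Return the (min, max) of the actual values and of the predictions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return {
        'target': (y_true.min(), y_true.max()),
        'prediction': (y_pred.min(), y_pred.max()),
    }

def fraction_outside(y_true, y_pred):
    """Fraction of actual values lying outside the span of the predictions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    outside = (y_true < y_pred.min()) | (y_true > y_pred.max())
    return outside.mean()
```

With the model from the question, this could be called as `range_report(y_test, model.predict(X_test))`. A prediction range clearly narrower than the target range would confirm the clipping seen in the plots.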

0 answers:

No answers yet