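Improving the accuracy of a prediction algorithm

Time: 2019-10-15 17:05:58

Tags: machine-learning neural-network prediction

I am new to machine learning and am currently working on a prediction problem. Here is a link to an Excel spreadsheet with a few columns of data: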

https://drive.google.com/file/d/1fWf6dX8kOCRB3GpX42AF6UvTmd0g9zXp/view?usp=sharing

I am trying to predict the values in column F from the values in columns A to E. My code is given below.

import pandas as pn
from keras.layers import Dense
from keras.models import Sequential
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
dataset = pn.read_excel(r"G:\Machine learning\data\database.xlsx", "Sheet5")
dataset.columns = ['A','B','C','D','E','F']

print (dataset)
#check= dataset.iloc[0:,3 :13]
X = dataset.iloc[0:,0 :5]
print(X)
Y = dataset.iloc[0:, 5 :6]


print(Y)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.15, random_state = 0)
print(X_test)
print(Y_test)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
#
model = Sequential()
##
### Adding the input layer and the first hidden layer
model.add(Dense(32, activation = 'relu', input_dim = 5, kernel_initializer='normal'))
##
### Adding the remaining hidden layers
model.add(Dense(units = 16, activation = 'relu'))
model.add(Dense(units = 64, activation = 'relu'))
model.add(Dense(units = 8, activation = 'relu'))
model.add(Dense(units = 16, activation = 'relu'))
#model.add(Dense(units = 8, activation = 'linear'))
###
#### Adding the third hidden layer
#model.add(Dense(units = 16, activation = 'relu'))
#model.add(Dense(units = 16, activation = 'relu'))
#model.add(Dense(units = 16, activation = 'relu'))
##
### Adding the output layer
# A single linear output unit is enough for one regression target;
# stacking several Dense(1) layers only adds a needless bottleneck.
model.add(Dense(units = 1))
### Compiling the ANN
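# 'accuracy' is a classification metric and is meaningless for regression;
# track mean absolute error instead.
model.compile(optimizer = 'nadam', loss = 'mean_squared_error', metrics = ['mae'])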
##
### Fitting the ANN to the Training set
history = model.fit(X_train, Y_train, epochs=125, batch_size=5,  verbose=1, validation_split=0.1)
##
y_pred = model.predict(X_test)
##


y_pred1 = model.predict(X_train)
print (y_pred1)
Y_test.reset_index(drop= True, inplace= True)
print (Y_train)
Y_train.reset_index(drop= True, inplace= True)
plt.plot(y_pred1)
plt.plot(Y_train)
plt.show()
plt.plot(y_pred)
plt.plot(Y_test)
plt.show()
print (y_pred)
print (Y_test)
plt.plot((Y_test-y_pred)*100/Y_test)
plt.show()

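The fit I get from this code is shown below.

[Image: Fit]

Now, when I predict, the error is very large in some cases, as shown below.

[Image: Prediction]

Can someone guide me on how to improve the code to get better predictions?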



1 answer:

Answer 0 (score: 0)

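For a dataset as small as the one you provided, the neural network may simply be too complex (five hidden layers with up to 64 units each), which makes it prone to overfitting.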
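One way to check whether a network is more complex than the data warrants is to compare a much smaller model against a plain linear baseline on held-out folds. The sketch below is only an illustration under stated assumptions: it uses synthetic data in place of the spreadsheet, and it swaps in scikit-learn's `MLPRegressor` for the Keras model purely to keep the example short. The variable names and the choice of 8 hidden units are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for columns A..E (features) and F (target). ASSUMPTION:
# the real spreadsheet is not reproduced here.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = X @ np.array([1.5, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=120)

# Putting the scaler inside the pipeline means it is re-fitted on each
# training fold, so no information leaks from the validation fold.
candidates = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "small_mlp": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
    ),
}

cv_mae = {}
for name, model in candidates.items():
    # scikit-learn reports errors as negative scores; flip the sign back.
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
    cv_mae[name] = -scores.mean()
    print(f"{name}: mean CV MAE = {cv_mae[name]:.3f}")
```

If the simple baseline matches or beats the network on the real data, that is a hint that extra layers are not helping.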

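You can try reducing the number of layers by hand and checking whether the prediction error goes down. In the longer run, if you want to tune hyperparameters more systematically, consider methods such as random or grid search, cross-validation, and regularization.

Random search: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html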
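As a rough sketch of how random search, cross-validation, and regularization can be combined, the example below tunes the width of a single hidden layer and the L2 penalty `alpha` with `RandomizedSearchCV`. It makes the same assumptions as any toy example: the data is synthetic, `MLPRegressor` stands in for the Keras network, and the search ranges are arbitrary starting points rather than recommended values.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the spreadsheet (ASSUMPTION): 5 features, 1 target.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = X @ np.array([1.5, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=120)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("mlp", MLPRegressor(max_iter=2000, random_state=0)),
])

# Hypothetical search space: an integer here means one hidden layer of that
# width; alpha is the L2 regularization strength, sampled on a log scale.
param_distributions = {
    "mlp__hidden_layer_sizes": [4, 8, 16, 32],
    "mlp__alpha": loguniform(1e-5, 1e-1),
}

search = RandomizedSearchCV(
    pipe,
    param_distributions=param_distributions,
    n_iter=8,                           # number of random configurations tried
    cv=5,                               # 5-fold cross-validation per configuration
    scoring="neg_mean_absolute_error",
    random_state=0,
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV MAE:", -search.best_score_)
```

After fitting, `search.best_estimator_` is the pipeline refitted on all the data with the best configuration found, so it can be used directly for prediction.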