I am new to R and am trying to predict Weekly_Sales for a test dataset by training a neural network with the neuralnet package.
The data I am working with (train1):
Store Dept Date Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5 CPI Unemployment IsHoliday Rank Weekly_Sales
1 1 5/2/2010 42.31 2.572 -2000 -500 -100 -500 -700 211.0963582 8.106 0 13 24924.50
1 1 12/2/2010 38.51 2.548 -2000 -500 -100 -500 -700 211.2421698 8.106 1 13 46039.49
1 1 19/02/2010 39.93 2.514 -2000 -500 -100 -500 -700 211.2891429 8.106 0 13 41595.55
1 1 26/02/2010 46.63 2.561 -2000 -500 -100 -500 -700 211.3196429 8.106 0 13 19403.54
1 1 5/3/2010 46.50 2.625 -2000 -500 -100 -500 -700 211.3501429 8.106 0 13 21827.90
1 1 12/3/2010 57.79 2.667 -2000 -500 -100 -500 -700 211.3806429 8.106 0 13 21043.39
Splitting the data:
>ind<- sample(2,nrow(train1),replace= TRUE,prob=c(0.7,0.3))
>train <- train1[ind==1,]
>test <- train1 [ind==2,]
Training set:
>head(train)
Store Dept Date Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5 CPI Unemployment IsHoliday Rank Weekly_Sales
1 1 5/2/2010 42.31 2.572 -2000 -500 -100 -500 -700 211.0963582 8.106 0 13 24924.50
1 1 26-02-2010 46.63 2.561 -2000 -500 -100 -500 -700 211.3196429 8.106 0 13 19403.54
1 1 5/3/2010 46.50 2.625 -2000 -500 -100 -500 -700 211.3501429 8.106 0 13 21827.90
1 1 19-03-2010 54.58 2.720 -2000 -500 -100 -500 -700 211.2156350 8.106 0 13 22136.64
1 1 26-03-2010 51.45 2.732 -2000 -500 -100 -500 -700 211.0180424 8.106 0 13 26229.21
1 1 2/4/2010 62.27 2.719 -2000 -500 -100 -500 -700 210.8204499 7.808 0 13 57258.43
Test set:
>head(test)
Store Dept Date Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5 CPI Unemployment IsHoliday Rank Weekly_Sales
1 1 12/2/2010 38.51 2.548 -2000 -500 -100 -500 -700 211.2421698 8.106 1 13 46039.49
1 1 19-02-2010 39.93 2.514 -2000 -500 -100 -500 -700 211.2891429 8.106 0 13 41595.55
1 1 12/3/2010 57.79 2.667 -2000 -500 -100 -500 -700 211.3806429 8.106 0 13 21043.39
1 1 7/5/2010 72.55 2.835 -2000 -500 -100 -500 -700 210.3399684 7.808 0 13 17413.94
1 1 21-05-2010 76.44 2.826 -2000 -500 -100 -500 -700 210.6170934 7.808 0 13 14773.04
1 1 28-05-2010 80.44 2.759 -2000 -500 -100 -500 -700 210.8967606 7.808 0 13 15580.43
The code I used is shown below:
>library(neuralnet)
>n <-neuralnet(Weekly_Sales~Temperature+Fuel_Price+MarkDown1+MarkDown2+MarkDown3+MarkDown4+MarkDown5+CPI+Unemployment+IsHoliday+Rank,data= train,hidden=c(4,3),err.fct="sse",linear.output=FALSE)
>plot(n)
>output <- compute(n,test[,4:14])
>output1 <- output$net.result*(max(test$Weekly_Sales)-min(test$Weekly_Sales))+min(test$Weekly_Sales)
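The network trains, but the reported error is on the order of 10^13. I get the same output every time I run the code, and the predictions are nowhere near the actual Weekly_Sales in the test data. I also tried the dataset for another department and still got the same predictions.
Output: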
>head(output$net.result)
[,1]
2 0.9999999998
3 0.9999999998
6 0.9999999998
14 0.9999999998
16 0.9999999998
17 0.9999999998
> head(output1)
[,1]
2 149743.97
3 149743.97
6 149743.97
14 149743.97
16 149743.97
17 149743.97
Answer 0 (score: 1)
You need to normalize the data before calling neuralnet(). So, before splitting train1 into train/test, use the code below:
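# Scale only numeric columns (Date is not numeric); constant columns such as Store would become NaN
num_cols <- sapply(train1, is.numeric)
minimum <- apply(train1[, num_cols], 2, min)
maximum <- apply(train1[, num_cols], 2, max)
train1_scaled <- train1
train1_scaled[, num_cols] <- scale(train1[, num_cols], center = minimum, scale = maximum - minimum)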
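As a minimal sketch of what the min-max scaling above does, the following uses a small made-up data frame and only base R. The name `toy` and its values are illustrative assumptions, not the real train1. Each numeric column is mapped onto [0, 1]; a constant column would have max - min = 0 and turn into NaN, so such columns are best dropped first.

```r
# Toy data standing in for train1 (values are made up for illustration)
toy <- data.frame(
  Temperature  = c(42.31, 38.51, 39.93, 46.63),
  Fuel_Price   = c(2.572, 2.548, 2.514, 2.561),
  Weekly_Sales = c(24924.50, 46039.49, 41595.55, 19403.54)
)

# Column-wise minimum and maximum
minimum <- apply(toy, 2, min)
maximum <- apply(toy, 2, max)

# scale() subtracts `center` and divides by `scale`, column by column
toy_scaled <- as.data.frame(scale(toy, center = minimum, scale = maximum - minimum))

# Print the smallest and largest value of each scaled column
print(sapply(toy_scaled, range))
```

Keeping `minimum` and `maximum` around matters, because the same values are needed later to convert predictions back to the original units.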
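Then split the data with your code (using train1_scaled in place of train1) and fit the network with the following call:
# linear.output should be TRUE because you are predicting a continuous dependent variable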
n <- neuralnet(Weekly_Sales~Temperature+Fuel_Price+MarkDown1+MarkDown2+MarkDown3+MarkDown4+MarkDown5+CPI+Unemployment+IsHoliday+Rank,data= train,hidden=c(4,3),err.fct="sse",linear.output=TRUE)
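To see why the original run returned 0.9999999998 for every row, it may help to look at the logistic activation on its own. With linear.output=FALSE the output neuron is passed through the activation function (logistic by default in neuralnet), so predictions stay inside (0, 1) and cannot approach an unscaled target in the tens of thousands. The sketch below uses base R's `plogis()` as a stand-in for that activation; the input values are arbitrary.

```r
# Logistic (sigmoid) function; plogis() is the base R implementation
z <- c(-5, 0, 5, 25)
out <- plogis(z)
print(out)

# The output saturates near 1 for large inputs and never exceeds it,
# so an unscaled target such as 24924.50 is out of reach
unscaled_target <- 24924.50
print(max(out) < unscaled_target)
```

With a scaled target and linear.output=TRUE, the output neuron is left linear, which is the usual choice for regression.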
The code that follows also needs a small change:
#basically to convert it back to non-scaled version, you need to do it using non-scaled original data not 'test' dataset
output1 <- output$net.result*(max(train1$Weekly_Sales)-min(train1$Weekly_Sales))+min(train1$Weekly_Sales)
#also the dependent variable in test dataset will need conversion
test$Weekly_Sales_nonScaled <- test$Weekly_Sales*(max(train1$Weekly_Sales)-min(train1$Weekly_Sales))+min(train1$Weekly_Sales)
#After this you can compare original data (test$Weekly_Sales_nonScaled) with predicted data (output1)
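The back-conversion and comparison described above can be sketched with base R alone. Everything here is an illustrative assumption: `sales` stands in for the unscaled train1$Weekly_Sales, `pred_scaled` for output$net.result, and RMSE is just one possible way to compare predictions with actual values.

```r
# Made-up unscaled sales standing in for train1$Weekly_Sales
sales <- c(24924.50, 46039.49, 41595.55, 19403.54, 21827.90)
lo <- min(sales)
hi <- max(sales)

# Forward transform (what the scaling step does to the target)
scaled <- (sales - lo) / (hi - lo)

# Hypothetical network outputs on the scaled range
pred_scaled <- c(0.25, 0.90, 0.80, 0.05, 0.10)

# Inverse transform, using the same min/max as the forward step
pred   <- pred_scaled * (hi - lo) + lo
actual <- scaled * (hi - lo) + lo

# Root mean squared error in the original units
rmse <- sqrt(mean((actual - pred)^2))
print(rmse)
```

The inverse transform only recovers the original units if it uses the same minimum and maximum as the forward transform, which is why the unscaled train1 is used rather than the test set.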
Please don't forget to let us know whether this helped :)